Learning to solve geometric construction problems from images
J. Macke
J. Sedlář
M. Olsak
J. Urban
J. Sivic
[Paper]
[GitHub]
[Supplementary material]

Abstract

We describe a purely image-based method for finding geometric constructions with a ruler and compass in the Euclidea geometric game. The method is based on adapting the Mask R-CNN state-of-theart visual recognition neural architecture and adding a tree-based search procedure to it. In a supervised setting, the method learns to solve all 68 kinds of geometric construction problems from the first six level packs of Euclidea with an average 92% accuracy. When evaluated on new kinds of problems, the method can solve 31 of the 68 kinds of Euclidea problems. We believe that this is the first time that purely image-based learning has been trained to solve geometric construction problems of this difficulty.



Diagram of our approach The goal is to construct a triangle given three points. The input is an RGB image with the current state of the construction in the red channel and the goal in the green channel. The Mask R-CNN predicts a line between two of the three points. The dashed rectangle denotes the bounding box of the line and the small square magenta masks denote the two points on the line. This Mask R-CNN line detection is then converted into a Euclidea tool action, Line(A, B), represented by the tool name and its arguments.


Example solution:


Example of a computed construction: Regular hexagon by the side. Following figures describes step by step construction. Each step is on one row, and contains current progress on the left and the Mask R-CNN prediction for a new step on the right. The red denotes the current state of the construction, the green color denotes the remaining goal and the blue color denotes new circle/line that will be constructed based on the detection. Other colors mark prediction masks, bounding boxes, classes and scores for each detected object.
a) Level definition and first input for the network. Construct a regular hexagon given by the side. The green color denotes the goal and the red denotes the current state of the construction. b) Prediction of the network. Based on this prediction, a circle will be constructed.
c) Step 1. d) Prediction for step 2. Based on this prediction, a line will be constructed.
e) Step 2. f) Prediction for step 3. Based on this prediction, a line will be constructed.
g) Step 3. h) Prediction for step 4. Based on this prediction, an angle bisector will be constructed. Note that there is an extra detection of parallel through point. Later in construction score of this prediction will increase and the prediction will be used.
i) Step 4. The parallel line constructed part of the goal. j) Prediction for step 5. Based on this prediction, a parallel line will be constructed.
k) Step 5. The parallel line constructed another part of the goal. l) Prediction for step 5. Based on this prediction, a parallel line will be constructed.
m) Step 6. The perpendicular bisector constructed another part of the goal n) Prediction for step 5. Based on this prediction, a perpendicular bisector will be constructed.
o) Step 7 - level successfully finished, the whole goal has been constructed.



Paper and Supplementary Material

J. Macke, J. Sedlář, M. Olšák, J. Urban, J. Sivic.
Learning to solve geometric construction problems from images




@inproceedings{geometry_reasoning2021,
    title={Learning to solve geometric construction problems from images},
    author={J. Macke and J. Sedlar and M. Olsak and J. Urban and J. Sivic},
    year={2021}

}
[Bibtex]


Acknowledgements

This work was partly supported by the European Regional Development Fund under the projects IMPACT and AI&Reasoning (reg. no. CZ.02.1.01/0.0/0.0/15_003/0000468 and CZ.02.1.01/0.0/0.0/15_003/0000466) and the ERC Consolidator grant SMART no. 714034.