I.V. Zhabokrytskyi
Èlektron. model. 2022, 44(5):73-89
https://doi.org/10.15407/emodel.44.05.073
ABSTRACT
The dynamics of modern society and rapid technological progress have created the need to interact with fast-changing, client-oriented information in real time. Augmented reality technology meets this need by letting users interact in real time with both the real physical world and the virtual digital world. The rapid digitization of everyday life has produced exponential growth in the amount of available data, posing new challenges to the scientific community. At the same time, deep learning, which is already applied successfully in many fields, holds considerable potential. The purpose of this study is to present the potential of combining augmented reality and deep learning technologies, their mutual improvement, and their application in the development of modern highly intelligent software. The paper briefly introduces the concepts of augmented and mixed reality and describes deep learning technology. Based on a literature review, relevant studies on the development of augmented reality applications and systems that use these technologies are presented and analyzed. After discussing how integrating deep learning into augmented reality improves the quality and efficiency of applications and eases the daily life of their users, conclusions and suggestions for future research are given.
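At the core of the augmented reality interaction described above is registration: placing a virtual anchor (for example, one produced by a deep-learning object detector) at the correct pixel in the camera image. The following is a minimal sketch of that projection step using a pinhole camera model; the intrinsics, pose, and anchor point are illustrative assumptions, not values from the paper.

```python
import numpy as np

def project_point(point_3d, K, R, t):
    """Project a 3D world point into pixel coordinates with a pinhole camera.

    point_3d: (3,) world coordinates of the virtual anchor
    K: (3, 3) camera intrinsics; R: (3, 3) rotation; t: (3,) translation
    Returns (u, v) pixel coordinates where the overlay should be drawn.
    """
    p_cam = R @ np.asarray(point_3d, dtype=float) + t  # world -> camera frame
    p_img = K @ p_cam                                  # camera frame -> image plane
    return p_img[0] / p_img[2], p_img[1] / p_img[2]    # perspective divide

# Assumed intrinsics for a 640x480 camera: 500 px focal length, centered principal point
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)    # camera axes aligned with world axes
t = np.zeros(3)  # camera at the world origin

# An anchor 2 m straight ahead projects to the image center (320, 240)
u, v = project_point([0.0, 0.0, 2.0], K, R, t)
```

In a full AR pipeline, R and t would come from a tracking system and the anchor from a detection or segmentation network, but the overlay placement reduces to this projection.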
KEYWORDS
augmented reality; machine learning; deep learning; neural networks; virtual reality.