I.V. Zhabokrytskyi
Èlektron. model. 2022, 44(5):73-89
https://doi.org/10.15407/emodel.44.05.073
ABSTRACT
The dynamics of modern society and rapid technological progress have created the need to interact with fast-changing, client-oriented information in real time. Augmented reality technology meets this need by letting users interact in real time with both the real physical world and the virtual digital world. The rapid digitization of everyday life has produced exponential growth in the amount of available data, posing new challenges to the scientific community. At the same time, deep learning, which is already applied successfully in many fields, holds considerable potential. The purpose of this study is to present the potential of combining augmented reality and deep learning technologies, their mutual improvement, and their application in the development of modern highly intelligent software. The paper briefly introduces the concepts of augmented and mixed reality and describes deep learning technology. Based on a literature review, relevant studies on the development of augmented reality applications and systems that use these technologies are presented and analyzed. After discussing how integrating deep learning into augmented reality improves the quality and efficiency of applications and eases the daily life of their users, conclusions and suggestions for future research are given.
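At the core of the augmented reality interaction described above is registration: placing a virtual anchor (for example, one produced by a deep-learning object detector) at the correct pixel in the camera image. The following is a minimal sketch of that projection step using a pinhole camera model; the intrinsics, pose, and anchor point are illustrative assumptions, not values from the paper.

```python
import numpy as np

def project_point(point_3d, K, R, t):
    """Project a 3D world point into pixel coordinates with a pinhole camera.

    point_3d: (3,) world coordinates of the virtual anchor
    K: (3, 3) camera intrinsics; R: (3, 3) rotation; t: (3,) translation
    Returns (u, v) pixel coordinates where the overlay should be drawn.
    """
    p_cam = R @ np.asarray(point_3d, dtype=float) + t  # world -> camera frame
    p_img = K @ p_cam                                  # camera frame -> image plane
    return p_img[0] / p_img[2], p_img[1] / p_img[2]    # perspective divide

# Assumed intrinsics for a 640x480 camera: 500 px focal length, centered principal point
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)    # camera axes aligned with world axes
t = np.zeros(3)  # camera at the world origin

# An anchor 2 m straight ahead projects to the image center (320, 240)
u, v = project_point([0.0, 0.0, 2.0], K, R, t)
```

In a full AR pipeline, R and t would come from a tracking system and the anchor from a detection or segmentation network, but the overlay placement reduces to this projection.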
KEYWORDS
augmented reality; machine learning; deep learning; neural networks; virtual reality.