Optimization of Search Queries by Introducing the Concept of Weight

Anastasia Mironova

Abstract


As information technology develops, both the volume of information and the required speed of its processing keep growing. This article proposes a method for speeding up element search in wide tables, that is, tables with a large number of columns; the method also applies to table joins. It relies on a specific indexing of elements built on weight, a concept similar to the norm of a space. Weight induces an equivalence relation on the set of table elements and removes certain limitations of indexing by elements, such as dependence on the order of characteristics and restrictions on the data types stored in the tables. The article also discusses predicting the size of a join result from metadata alone, without performing the join. Such predictions make it possible to order a sequence of joins optimally or near-optimally, which significantly improves the efficiency of the join operation by reducing the size of the tables being processed.
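The abstract does not define the weight function, so the following is only a minimal sketch of the general idea and not the author's method. It assumes a hypothetical weight computed as a sum of per-value hash codes, which makes it independent of column order and of value types. The names `weight`, `weight_histogram`, and `estimate_join_size` are illustrative assumptions. Rows with equal weights fall into the same equivalence class, and the per-class row counts serve as metadata from which an upper bound on the equi-join size can be computed without performing the join.

```python
import hashlib
from collections import Counter


def weight(values):
    """Hypothetical weight (an assumption, not the article's definition).

    Sums a 64-bit hash code of each value, so the result does not depend
    on the order of the values and works for any type that has a repr().
    """
    total = 0
    for v in values:
        digest = hashlib.sha256(repr(v).encode("utf-8")).digest()
        total += int.from_bytes(digest[:8], "big")
    return total % (1 << 64)


def weight_histogram(rows, key_columns):
    """Metadata for one table: number of rows in each weight class."""
    return Counter(weight(row[c] for c in key_columns) for row in rows)


def estimate_join_size(hist_a, hist_b):
    """Upper bound on the equi-join size, computed from metadata only.

    Rows can match only if their weights are equal, so each shared weight
    class contributes at most count_a * count_b result rows.
    """
    return sum(n * hist_b[w] for w, n in hist_a.items() if w in hist_b)


if __name__ == "__main__":
    orders = [
        {"cust": 1, "region": "EU"},
        {"cust": 2, "region": "US"},
        {"cust": 1, "region": "EU"},
    ]
    customers = [
        {"cust": 1, "region": "EU"},
        {"cust": 3, "region": "EU"},
    ]
    keys = ["cust", "region"]
    bound = estimate_join_size(
        weight_histogram(orders, keys), weight_histogram(customers, keys)
    )
    print("estimated join size (upper bound):", bound)
```

Because different rows can share a weight (by reordering of values or by hash collision), the estimate can exceed the true join size but never falls below it. A join planner could use such bounds to run the joins with the smallest predicted results first.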

Full Text:

PDF (Russian)





ISSN: 2307-8162