Database engines are not black box technologies with infiinte capapcity. This starts becoming very clear when one reads about the struggles FaceBook and WikiPedia went through in the ~2008 era to keep up with the demand for their software.
Network applications can take little for granted if they foresee massive growth in their horizons. So let’s discuss aspect of the Relational Database that may be overlooked by some.
Data retrieval.
Without getting into the technicals too much. When MySQL retrieves records, it treats the retrieved data differently based on the type of data at hand.
Text/blob data types typically aren’t kept in memory for-example; so for every read operation they would be fetched from the hard-disk directly; which takes longer than retrieving from RAM.
This can be important when you are running a complicated query that can benefit from storing sub-query results in memory while a secondary query attempts to further refine the results; the server wouldn’t have to head back to the hard-disk to fetch the actual text/blob data for every refinement.
Data retrieval in a complicated query as described above is further amplified when you consider having the database spread over several sharded servers.
When examining the full extent of your database schema for optimizations; it might be worth the trouble to consider data retrieval strategies.
Have you considered storing compiled records in a file-storage format as a form of query-caching? Imagine the possibilities! Implemented correctly, you can lift a slice of the load from your database servers as dry and simple data retrievals that require no operations on text/blob fields will be relieved of having to fetch data from the hard-disk every time.
Of course this only helps in certain types of applications and requires a careful examination of your application’s main points of strain. However, when you need to grow; understanding their are more possibilities for optimization should be a stress reliever.
By Mustafa Ghayyur,
September 14, 2018