The Database Chronicles

Machine Learning (ML) has been around for decades. Lots of applications of it can be provided. The idea here is to feed the agent with the ability to learn by itself. We don’t program all possible if-then-else cases but rather provide samples and the entity learns how to behave from this point on (supervised learning), or drop the entity in to an environment and then punish or reward it based on its behavior and it becomes smarter as it hits its head against the wall oftener and oftener (reinforcement leaning), or lastly given collection of data, the agent thinks about properly classifying it or splitting them into possible groups (unsupervised learning) that share similar properties. Take a look at this to see how a robot learns to flip pancakes after 50 trials. It does impress.

Database Management Systems (DBMS) have some candidate modules for integrating with ML. Query Optimizer can serve as an example of it. A query may have multiple plans for execution. The job of the optimizer is to pick up a plan with the least cost. Cost may include many metrics, such as number of I/O or number of rows processed (for In-Memory DBMS). We can treat the optimizer as agent/robot. And the system as the environment. Cost can serve as reward. The lesser the cost, the more we reward the optimizer. So, even though this time a particular cost was picked up, the optimizer will try to improve it the next time. In this way, the optimizer can learn from its behavior itself rather than being programmed for most of possible cases.

Harder to do than said. In theory, this would work. In practice….it needs a careful design and lots of experimentation. The hardest part would be to map optimizer’s behavior onto reinforcement learning paradigm. Environments are so many and very different.

Computer science associate professor in MIT Tim Kraska briefly speaks about how DBAs can be released from index management in databases if machine learning used. Once data distribution changes, it could be worth to reconsider indexes. Index management module can learn to pick up columns to be indexed as data in the database changes over time.

Recently, group of researchers and students at Carnegie Mellon University have developed a tool named OtterTune that aims at simplifying DBMS deployment. The tool has a repository that keeps data from historical tuning sessions. This experience is used then by the ML engine to develop new, appropriate recommendations for new configuration depending on application objective. In the corresponding paper, authors conduct experiments on three DBMSs: MySQL, Postgres and Vector.

Oracle introduced some adaptive features in 12c which is a step towards intelligent tuning. One of them is adaptive execution plan feature which, briefly put, works as follows. An SQL statement is executed with an initially generated plan and candidate plans are kept in the cursor of the query. Say, for a join in the query Nested Loop (NL) is the initial one. So, when the execution reaches this join point, rows are buffered and compared with initial estimates. If comparison reaches certain threshold, then the NL is replaced with Hash Join which would be a smarter decision for this case. Thus, a join method switch happens on the fly, at run time. Execution considers feedback. Subsequent executions of the same SQL query then start using this plan from then on.

Database Management Systems leave a large room for application of machine learning methods. Candidates can be query optimization, database recovery and checkpointing, memory management and instance tuning. Along with database knowledge, ML skills for future DBAs might be a plus.

Component	Waits	Time(s)	Avg wait (ms)	% DB time	Wait Class
latch free	235,598	24,906	106	309.52	Other
DB CPU		1,039		12.91
db file sequential read	55,744	418	7	5.19	User I/O
control file sequential read	160,881	166	1	2.07	System I/O
log file sync	114	1	13	0.02	Commit

Component

Waits

Time(s)

Avg wait (ms)

% DB time

Wait Class

latch free

235,598

24,906

106

309.52

Other

DB CPU

1,039

12.91

db file sequential read

55,744

418

5.19

User I/O

control file sequential read

160,881

166

2.07

System I/O

log file sync

114

0.02

Commit

Latch Name	Get Requests	Misses	Sleeps	Spin Gets
Result Cache: RC Latch	1,862,968	288,200	273,530	61,310
space background task latch	7,252	2,098	72	2,035
qmn task queue latch	1,010	136	8	128
cache buffers chains	4,119,113	42	6	33
managed standby latch	96	6	1	5
call allocation	3,344	2	1	1
resmgr:active threads	3,572	1	1	0

Latch Name

Get Requests

Misses

Sleeps

Spin Gets

Result Cache: RC Latch

1,862,968

288,200

273,530

61,310

space background task latch

7,252

2,098

2,035

qmn task queue latch

1,010

136

128

cache buffers chains

4,119,113

managed standby latch

call allocation

3,344

resmgr:active threads

3,572

The Database Chronicles

Intelligent Database Management Systems

Latching and Result Cache – A Quick Troubleshooting Case