SQL Explain Plan: What Is Materialize

SQL explain plan: what is Materialize?

A Materialize node means the output of whatever is below it in the tree (which can be a scan, a full set of joins, or something similar) is materialized into memory before the upper node is executed. This is usually done when the outer node needs a source that it can rescan.

So in your case, the planner has determined that the result of a scan on one of your tables will fit in memory, and that materializing it will make it possible to choose an upper join operation that requires rescans while still being cheaper.
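To see when the planner picks a Materialize node, one option is to compare plans with the `enable_material` planner setting switched on and off. The following is a minimal PostgreSQL sketch; the tables `small_t` and `big_t` and their contents are hypothetical, and the planner may or may not choose Materialize for them depending on version, statistics, and settings.

```sql
-- Hypothetical tables, used only for illustration.
CREATE TABLE small_t (id int, val int);
CREATE TABLE big_t (id int, val int);
INSERT INTO small_t SELECT g, g % 10 FROM generate_series(1, 1000) AS g;
INSERT INTO big_t SELECT g, g % 10 FROM generate_series(1, 100000) AS g;
ANALYZE small_t;
ANALYZE big_t;

-- A non-equality join condition rules out hash and merge joins,
-- so a nested loop (which rescans its inner side) is the likely choice.
EXPLAIN
SELECT * FROM big_t b JOIN small_t s ON b.val < s.val;

-- Discourage Materialize for this session and compare the plan and its cost.
SET enable_material = off;
EXPLAIN
SELECT * FROM big_t b JOIN small_t s ON b.val < s.val;
RESET enable_material;
```

If a Materialize node appears in the first plan and not in the second, the difference in estimated cost shows what the planner expected to save by caching the inner rows.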

In a PostgreSQL query plan, what is the difference between Materialize and Hash?

After consulting the source I see that Materialize is basically just a contiguous cache of rows (a tuplestore) that is constantly rewound and iterated again for each of the outer rows.

Do CTEs materialize a computation?

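Generally, SQL Server doesn't materialise CTEs, as opposed to, say, Postgres (which always materialised them before version 12 and, since then, may inline a side-effect-free, non-recursive CTE that is referenced only once).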

You can confirm it by examining the actual execution plan for your query.
I'd recommend SentryOne Plan Explorer, it is a great tool.

https://www.sentryone.com/plan-explorer

I expect to see 7 calls to replace in your example.


Well, I miscalculated. The real answer is:

you should check the actual execution plan.

In your example it looks like this:

9 calls to replace in the Filter operator,

plus 3 calls in the Compute Scalar operator,

12 in total.


So, we confirmed that SQL Server didn't materialise the CTE in this example. (It was SQL Server 2017 Developer Edition.)
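When an expensive expression really should be computed only once in SQL Server, a common workaround is to store the intermediate result in a temporary table instead of a CTE. The sketch below uses T-SQL; `dbo.source_table`, its columns, and the `#cleaned` temp table are hypothetical names and do not come from the original question.

```sql
-- Hypothetical source table, used only for illustration.
CREATE TABLE dbo.source_table (id int PRIMARY KEY, name varchar(100));

-- Compute REPLACE once per row and store the result.
SELECT id, REPLACE(name, 'a', 'b') AS cleaned_name
INTO #cleaned
FROM dbo.source_table;

-- Both references below read the stored values;
-- REPLACE is not evaluated again here.
SELECT c1.id
FROM #cleaned AS c1
JOIN #cleaned AS c2
  ON c2.cleaned_name = c1.cleaned_name
 AND c2.id <> c1.id;

DROP TABLE #cleaned;
```

Whether this is faster than the CTE version depends on the data volume and on the cost of the expression, so it is worth comparing the actual execution plans of both.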

Some further reading:

What's the difference between a CTE and a Temp Table?

Is there a performance difference between CTE , Sub-Query, Temporary Table or Table Variable?

Use of With Clause in SQL Server

There is a suggestion for Microsoft to add a Materialize hint for CTE, similar to what Oracle offers: T-SQL Common Table Expression "Materialize" Option

Postgres Materialize causes poor performance in delete query

My guess is that at rows=524289 the memory buffer is filled up, so the subquery has to be materialized on the disk. Hence the dramatic increase in the time needed.

Here you can read more about configuring the memory buffers: http://www.postgresql.org/docs/9.1/static/runtime-config-resource.html

If you play with work_mem you will see the difference in the query behavior.
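One way to try this is to raise work_mem for a single transaction and compare the timings reported by EXPLAIN ANALYZE. This is a PostgreSQL sketch only: the tables `items` and `kept_items`, the LIMIT, and the '64MB' value are hypothetical stand-ins for the original query and settings.

```sql
-- Hypothetical tables standing in for the ones in the question.
CREATE TABLE items (id int PRIMARY KEY);
CREATE TABLE kept_items (item_id int NOT NULL);

SHOW work_mem;  -- current setting

BEGIN;
-- SET LOCAL lasts only until the end of this transaction.
SET LOCAL work_mem = '64MB';

-- EXPLAIN ANALYZE really executes the DELETE,
-- so it is wrapped in a transaction that is rolled back.
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM items
WHERE id NOT IN (SELECT item_id FROM kept_items LIMIT 600000);

ROLLBACK;
```

Running the same block with a smaller work_mem and comparing the two plans should show whether the subquery result stops fitting in memory at some point.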

However, using a join in the subquery is a much better way to speed up the query, since you limit the number of rows at the source itself, rather than simply selecting the first XYZ rows and then performing the checks.

What does MATERIALIZED mean in the select_type column in the result of MySQL EXPLAIN?

It means that the result of a subquery was saved as a virtual temporary table instead of the subquery being executed for each row. This was introduced in MySQL 5.7 and speeds up some queries that were previously very slow because the result of their subquery parts wasn't cached.
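A quick way to observe this is to run EXPLAIN on a query with an IN subquery and then repeat it with the `materialization` flag of `optimizer_switch` turned off. The MySQL sketch below uses hypothetical `orders` and `customers` tables; whether the optimizer actually chooses materialization for them depends on the data and the server version.

```sql
-- Hypothetical tables, used only for illustration.
CREATE TABLE customers (id INT PRIMARY KEY, country CHAR(2));
CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT);

-- Look at the select_type column of the output for the subquery row.
EXPLAIN
SELECT o.id
FROM orders AS o
WHERE o.customer_id IN (SELECT c.id FROM customers AS c WHERE c.country = 'DE');

-- Inspect the current optimizer flags, then disable materialization
-- for this session and run the EXPLAIN again to compare.
SELECT @@optimizer_switch;
SET optimizer_switch = 'materialization=off';

EXPLAIN
SELECT o.id
FROM orders AS o
WHERE o.customer_id IN (SELECT c.id FROM customers AS c WHERE c.country = 'DE');

SET optimizer_switch = 'materialization=on';
```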

Materialized View Performance of Exists vs In

This looks like a known bug. If you have access to My Oracle Support look at Slow Create/Refresh of Materialized View Based on NOT IN Definition Query (Doc ID 1591851.1), or less usefully if you don't, a summary of the problem is available.

The contents of the MOS version can't be reproduced here of course, but suffice to say that the only workaround is what you're already doing with not exists. It's fixed in 12c, which doesn't help you much.
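For readers who have not made that change yet, the rewrite has roughly the following shape. This is an Oracle sketch with hypothetical names (`table_a`, `table_b`, `mv_unmatched`); it is not the query from the question.

```sql
-- Hypothetical tables, used only for illustration.
CREATE TABLE table_a (id NUMBER PRIMARY KEY);
CREATE TABLE table_b (a_id NUMBER NOT NULL);

-- Definition using NOT EXISTS instead of
--   WHERE a.id NOT IN (SELECT b.a_id FROM table_b b)
CREATE MATERIALIZED VIEW mv_unmatched AS
SELECT a.id
FROM table_a a
WHERE NOT EXISTS (SELECT 1 FROM table_b b WHERE b.a_id = a.id);
```

Note that NOT IN and NOT EXISTS return the same rows only when the subquery column cannot be NULL, which is why `a_id` is declared NOT NULL in this sketch.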

Slow query when joining with a materialized view

activity_status_view that is essentially a subset of the rows of activity

That subset of the rows it chooses has excluded all the ones that actually match the join condition. So the LIMIT never kicks in and lets the query terminate early. Given that fact, the better plan would be to do a hash join, or use an index on activity_status_view.id, or on task.args ->> 'activityId' (you could try creating the last of those), but if the statistics are way off it might not realize that.

Because the distribution of values of ->>'activityId' is not visible to the planner (being inside a JSONB), it can't know how often those values might intersect. Creating the expression index can help it figure that out (run ANALYZE on the table after creating the index): the planner might use the statistics gathered for the index to solve the planning problem, without actually using the index in the plan.
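The expression index mentioned above could be created along these lines. The table and column names are taken from the question; the index name `task_activity_id_idx` is made up for this sketch.

```sql
-- Expression index on the JSONB field; note the extra pair of
-- parentheses that PostgreSQL requires around an expression.
CREATE INDEX task_activity_id_idx ON task ((args ->> 'activityId'));

-- Collect statistics, including those for the indexed expression.
ANALYZE task;
```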

Is it just a coincidence that the MV query excludes all the matching rows, or is that by design?

Did you VACUUM ANALYZE the materialized view after refreshing it?
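For reference, that sequence would look like this in PostgreSQL, using the view name from the question:

```sql
-- Rebuild the contents of the materialized view.
REFRESH MATERIALIZED VIEW activity_status_view;

-- Reclaim space and refresh the planner statistics for it.
VACUUM ANALYZE activity_status_view;
```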

It is unlikely that this has anything to do with the fact that it is a materialized view. If you made it be a free-standing table (CREATE TABLE whatever AS SELECT...) or a regular nonmaterialized view, you would probably have the same problem.


