Speeding up an UPDATE in MySQL
@WilsonHauck This project has been going through a lot of optimizations. There are several tables to be migrated and many more millions of records. I have experimented with many things, such as different buffer sizes and the MEMORY engine. I have benchmarks in place, and none of those changes helped for my use case.
@RickJames This particular UPDATE was the last statement that resisted optimization. Indeed, working on small ranges has been the key: I got a 5x speedup in my test environment for this particular statement. The compromise has been ranges of 5K and a pool of 20 threads (there are other threads doing other work in parallel). The test machine has 8 cores, but the production machine has 48, so I expect the speedup to be even greater.
I would like to understand the lock errors I was getting when ranges were on the order of hundreds of thousands (I mean, to actually know what they were rather than conjecture, and so to understand why they are not present in small ranges), and also to understand why I need to hand-code a more performant version of the update.
But that is just to better understand the details. This 5x speedup is incredible and enough for my purposes.
BTW, I believe an I/O-bound task can benefit from more threads than cores, because while one thread waits on I/O the CPU is idle and other threads can use it. It is for CPU-bound tasks that you won't squeeze out more performance with more threads than cores.
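The small-range, thread-pool approach described in this comment could be sketched roughly as follows. This is only an illustration: `id_ranges`, `run_chunked`, and the `update_range` callback are hypothetical names, and it assumes the rows can be partitioned by an integer id. The defaults of 5,000 ids per range and 20 threads are the figures quoted above.

```python
from concurrent.futures import ThreadPoolExecutor


def id_ranges(min_id, max_id, chunk_size=5000):
    """Yield inclusive (start, end) id ranges covering min_id..max_id."""
    start = min_id
    while start <= max_id:
        end = min(start + chunk_size - 1, max_id)
        yield start, end
        start = end + 1


def run_chunked(update_range, min_id, max_id, chunk_size=5000, workers=20):
    """Call update_range(start, end) for every chunk on a thread pool.

    update_range is a hypothetical callback supplied by the caller. It is
    assumed to open its own connection, run the UPDATE restricted to
    ids between start and end, commit, and return the affected row count.
    Returns the sum of those counts.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(update_range, start, end)
            for start, end in id_ranges(min_id, max_id, chunk_size)
        ]
        # result() re-raises any exception from a worker, so a failed
        # chunk is not silently ignored.
        return sum(future.result() for future in futures)
```

Each call handles one short range in its own transaction, so every statement touches few rows. Whether 5K and 20 are good values elsewhere is something only a benchmark on the target machine can tell.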
@Solarflare Since the multithreaded approach is what I was looking for, I didn't experiment with STRAIGHT_JOIN, but in the new approach the cardinalities are reversed, and MySQL now starts with invoice. Perhaps we also got an extra boost from starting there, as per your remark.
Improve Update Statement Performance
Try something like...
-- SQL Server (T-SQL) syntax
WITH x AS (
    SELECT Vendor, SUM(MonthlyAmt) AS TotalAmtCalculated
    FROM VendorTable
    GROUP BY Vendor
)
UPDATE t
SET t.TotalAmt = x.TotalAmtCalculated
FROM VendorTable t
INNER JOIN x ON t.Vendor = x.Vendor;
Your query is slow because your inner select is executed once for each row returned by the outer update query.
Also check whether any indexes on the table include the TotalAmt column; those indexes will also slow down your updates, because each one has to be maintained whenever the column changes.
How to improve massive UPDATE from SELECT performance?
For your query, you want an index on:
staged_items(item_id, upgrade_version, operation)
I am also thinking that you could rewrite the outer where clause as:
WHERE t1.id IN (SELECT t2.item_id
FROM staged_items t2
WHERE t2.upgrade_version = 1234 AND t2.operation = 'modification'
)
Then, you want indexes on staged_items(upgrade_version, operation, item_id) and items(id). Note that the order of the keys in the index is important, and you still want the first index so that the correlated subquery can look up the values.
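As a rough illustration of the two staged_items indexes and their differing key order, here is a small sketch that uses an in-memory SQLite database as a stand-in. The index names and column types are hypothetical, since the original schema is not shown, and the real database may use different DDL details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical minimal schema; only the indexed columns matter here.
    CREATE TABLE items (id INTEGER PRIMARY KEY);
    CREATE TABLE staged_items (
        item_id INTEGER,
        upgrade_version INTEGER,
        operation TEXT
    );
    -- For the correlated subquery: look up by item_id first.
    CREATE INDEX idx_staged_item_first
        ON staged_items (item_id, upgrade_version, operation);
    -- For the IN (...) rewrite: filter by version and operation first.
    CREATE INDEX idx_staged_filter_first
        ON staged_items (upgrade_version, operation, item_id);
""")


def index_columns(conn, index_name):
    """Return the indexed column names of index_name in key order."""
    return [row[2] for row in conn.execute(f"PRAGMA index_info({index_name})")]
```

The two indexes hold the same columns, but only the second one lets the database jump straight to the rows for a given upgrade_version and operation. In this sketch, items(id) is already covered by the primary key.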
Improving performance of UPDATE query with subqueries
You can use Joins instead of subqueries:
UPDATE T1
SET T1.coordinatesChecked = 1,
    T1.FromLatitude = T2.Latitude,
    T1.FromLongitude = T2.Longitude,
    T1.ToLatitude = T3.Latitude,
    T1.ToLongitude = T3.Longitude
FROM LoadsAvail AS T1
LEFT JOIN ZipCodes AS T2 ON T1.FromCity = T2.CityName AND T1.FromState = T2.ProvinceAbbr
LEFT JOIN ZipCodes AS T3 ON T1.toCity = T3.CityName AND T1.toState = T3.ProvinceAbbr
WHERE T1.coordinatesChecked = 0