Gpdb:Out of Memory at Segment

GPDB:Out of memory at segment

Do we need to increase the vmprotect limit? if Yes, then by which amount should we increase it?

There is a nice calculator on setting gp_vmem_protect_limit on Greenplum.org. The setting depends on how much memory, swap, and segments per host you have.

http://greenplum.org/calc/

You can be getting OOM errors for several reasons.

Bad query
Bad table distribution (skew)
Bad settings (like gp_vmem_protect_limit)
Not enough resources (RAM)

How to handle concurrency at gpdb?

More RAM, less segments per host, and workload management to limit the number of concurrent queries running.

How much swap we need to add here when we are already running with 30 GB RAM. currently, we have added 15GB swap here? is that ok ?

Only 30GB of RAM? That is pretty small. You can add more swap but it will slow down the queries compared to real RAM. I wouldn't use much more than 8GB of swap.

I recommend using 256GB of RAM or more especially if you are worried about concurrency.

What is the query to identify host connection with Greenplum database

select * from pg_stat_activity;

statement_mem seems to limit the node memory instead of the segment memory

I incorrectly stated that statement_mem is on a per host and that is not the case. This link is talking about the memory on a segment level:
http://gpdb.docs.pivotal.io/4370/guc_config-statement_mem.html#statement_mem

With the default of "eager_free" gp_resqueue_memory_policy, memory gets re-used so the aggregate amount of memory used may look low for a particular query execution. If you change it to "auto" where the memory isn't re-used, the memory usage is more noticeable.

Run an "explain analyze" of your query and see the slices that are used. With eager_free, the memory gets re-used so you may only have a single slice wanting more memory than available such as this one:

(slice18) * Executor memory: 10399K bytes avg x 2 workers, 10399K bytes max (seg0).  Work_mem: 8192K bytes max, 13088K bytes wanted.

And for your question on how to manage the resources, most people don't change the default values. A query that spills to disk is usually an indication that the query needs to be revised or the data model needs some work.

Greenplum gp_vmem_protect_limit configuration

There is a calculator for it!

http://greenplum.org/calc/

You should also add a swap file or disk. It is pretty easy to do in Amazon too. I would add at least a 4GB swap file to each host when you have 240GB of RAM.

Gpdb:Out of Memory at Segment