Details about HANA LOBs

last updated: 2022-06-22 16:30 CEST

Ok, large objects aka LOBs (no not Line of Business) are nothing special and well known from other database systems.
LOBs are unstructured data like a picture, a PDF or a message XML content. The shared properties of this kind of data is that they can be quite large.
For a in memory database those data can be a ressource killer. Because of this, not all LOB data is loaded into HANA. But there is more to report, because they are handled differently in depency to the HANA release and the running application. This must be taken into account in order to achieve max. performance.

Content:

  1. types of LOBs
  2. VARBINARY vs. LOBs
  3. garbage collection
  4. Parametrization
  5. Mini Checks
  6. Frequently occuring known issues
  7. Threads
  8. SQL checks
  9. Summary

Sources for details
2400005 – FAQ: SAP HANA Persistence

2220627 – FAQ: SAP HANA LOBs

1994962 – How-To: Activation of Hybrid LOBs in SAP HANA

2169283 – FAQ: SAP HANA Garbage Collection


LOBs can be stored as CS and RS table. Here we have different types of LOBs:

  1. hybrid LOBs (RS+CS): Small values (typically below 1 KB), in memory like normal column values
  2. Packed / midsize LOBs : Midsize values (typically between 1 KB and 4 KB), packed with other LOBs into one container on disk
  3. File LOBs: Large values (typically above 4 KB): one container on disk per LOB

You can use the table data collector (attached to note 1969700) to identifiy which table stores the LOBs in which way:
HANA_Tables_DataCollector*

SCHEMA_NAME TABLE_NAME  COLUMN_NAME   LOB_TYPE DISK_GB BINARY_GB LOB_COUNT
schema table CONTENT PACKED 3.00 2.90 1215200
schema table CONTENT FILE 1.20 0.80 1215200

VARBINARY can also hold quite large binary data (up to 5000 byte), but it doesn’t belong to the LOB data types. As a consequence the hybrid LOB feature can’t be used and the column is loaded into memory when it is accessed.

When you upgrade to SAP_BASIS >= 7.53 the previous VARBINARY columns remain in place until an ABAP table conversion happens. News tables are created with a LOB data type.

As a consequence many potentially large tables with large LRAW columns (<= 5000 byte) are completely loaded into memory (e.g. BALDAT, PCL2, SOC3, SWNCMONI, SWWCNTP0). In order to reduce the memory consumption you can consider taking advantage of paged attributes / Native Storage Extension (SAP Note 2799997) so that unused column pages are evicted from memory in an LRU manner. Please also see SAP Notes 2816823 and 2898319 to safeguard NSE usage in table conversion events. Be sure that your revision is at least on Rev. 57 to avoid issues with the LOBs.

LRAW length limit SAP_BASIS SAP HANA Data Type
<= 1000 all VARBINARY
1001 – 5000 <= 7.52 VARBINARY
1001 – 5000 >= 7.53 BLOB / CLOB
> 5000 all BLOB / CLOB

There is also a LOB garbage collection (=LGC) for deleted / fragmented data.
You can also trigger a manually GC for all CS tables:
ALTER SYSTEM RECLAIM COLUMN LOB SPACE
for a defined CS table:
ALTER TABLE “<table_name>” WITH PARAMETERS (‘LOB_GARBAGE_COLLECTION’ = ‘1’)
For row store tables it is not possible to trigger LOB garbage collection on table level.

ALTER SYSTEM RECLAIM [ROW|COLUMN] LOB SPACE [ALL] The ALL key word is available as of SAP HANA 2.00 and makes sure that also LOBs of tables are cleaned that are currently not loaded into memory.

Be aware that LOB garbage collection itself doesn’t purge the LOB files on disk level. This is handled at a later time by the persistence garbage collection. Starting with SAP HANA 2.0 SPS 06 you can alternatively use the SAP HANA job scheduler to trigger column store LOB garbage collection at specific times (SAP Note 3147465).

indexserver.ini -> [lobhandling] -> garbage_collect_daily_schedule_s -> 3600
Available starting with SAP HANA 2.0 SPS 00 as alternative to garbage_collect_interval_s (see below), inactive per default with SAP HANA <= 2.0 SPS 03
Specifies the time in seconds since midnight when the volume-scan LOB garbage collection is executed on a daily basis. A negative value pauses the daily scan.

indexserver.ini -> [lobhandling] -> garbage_collect_interval_s -> 900 (<= 2.00.053) / 43200 (>= 2.00.054)
Defines the time interval of (table-wise) column store LOB garbage collections, value 0 disables LOB garbage collection

Specifies the time in seconds between table-wise LOB garbage collection runs. If the interval is zero the table-wise LOB garbage collection is paused.

Attention: When garbage_collect_daily_schedule_s is explicitly set on SAP HANA <= 2.0 SPS 03, garbage_collect_interval_s is ignored.

indexserver.ini -> [row_engine] -> garbage_lob_file_handler_enabled -> true
Value ‘true’ activates row store LOB garbage collection
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_interval -> 10
Defines the time interval of row store LOB garbage collections, value 0 disables LOB garbage collection

In some patched environments both the table- and the volume-wise LGC is active.This is not neccessary! Only the volume-wise LOB garbage collection can remove LOBs belonging to previously removed columns, partitions or tables. Therefore, it is possible to completely disable the table-wise LOB garbage collection to eliminate its impact on system performance, but it is not recommended to disable the volume-wise LOB garbage collection for productive cases with LOB load.

SPS06
SchedulerJobs can only execute procedures, therefore the following command will create a procedure that triggers the LOB garbage collection:

CREATE PROCEDURE PROC_RECLAIM_COLUMN_LOB_SPACE AS BEGIN
EXEC 'ALTER SYSTEM RECLAIM COLUMN LOB SPACE';
END;

The following example defines the job to be executed once a week on Sunday beginning at 3 am at night:

CREATE SCHEDULER JOB RECLAIM_COLUMN_LOB_SPACE CRON '* * * SUN 3 0 0' ENABLE PROCEDURE PROC_RECLAIM_COLUMN_LOB_SPACE;

SQL Collection (1969700)

HANA_GarbageCollection_LOBs

Here are the default values (SPS05 – some revisions may differ):

indexserver.ini -> [persistence] -> midsizelob_threshold = 0
global.ini -> [sql] -> lob_memory_threshold
global.ini -> [sql] -> default_lob_storage_type = hybrid
indexserver.ini -> [persistence] -> disposition_lob_read = 1
indexserver.ini -> [persistence] -> disposition_lob_read_small = 1
indexserver.ini -> [persistence] -> disposition_lob_write = 1
indexserver.ini -> [persistence] -> disposition_lob_write_small = 1
indexserver.ini -> [lobhandling] -> garbage_collect_interval_s = 43200
# When garbage_collect_daily_schedule_s is explicitly set on SAP HANA <= 2.0 SPS 03, garbage_collect_interval_s is ignored
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_enabled = true
# row Value ‘true’ activates row store LOB garbage collection
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_interval = 10
# Defines the time interval of row store LOB garbage collections, value 0 disables LOB garbage collectionglobal.ini -> [persistence] -> lob_page_cache_max_size => 10-20% GAL
global.ini -> [persistence] -> lob_page_cache_reclaim_target_size = 5%
global.ini -> [persistence] -> lob_page_cache_reclaim_threshold_size = 10%

This means you can limit the size of in memory lobs which are placed as ressource killer. For sure it is a trade-off but are LOB data really business critical? You can save a lot of resources at the beginning of your sizing.

Check ID Description SAP note
M0371 Unused space in packed LOBs vs. data % 2220627
M0372 Unused space in file LOBs vs. data % 2220627
M0373 Tables with many small file LOBs 2220627
M0374 Tables with many disk LOBs 2220627
M0450 Tables with memory LOBs > 2GB 1994962
M0849 Orphan disk LOBs 2220627
M0850 Time since last CS LOB garb. coll. (days) 2220627

Check out note 1999993 – How-To: Interpreting SAP HANA Mini Check Results how to handle them.

Memory leak

Starting with JDBC 2.3.37 JDBC uses transactional LOBs. As a consequence database requests returning LOB data are no longer closed at the end of the statement execution. As a consequence statements can remain in SUSPENDED state for a long time, introducing various trouble like blocked garbage collection or growing Pool/Statistics and Pool/RowEngine/QueryExecution allocator sizes. This affects e.g. all Solution Managers Java stacks!
This can be fixed by setting transactionalLobs=false as stated out in 2711824 – High Number of Prepared Statements Causing High Usage of Memory Allocator Pool/Statistics.

I/O issues

The critical phase can also take longer in context of many disk LOBs (SAP Note 2220627) and a large Pool/PersistenceManager/PersistentSpace/DefaultConverter/ConvPage allocator (SAP Note 1999997, issue number 257466). The FlushResourcesThread is in this case busy clearing converter pages in a call stack like:

memset_impl
PageAccess::ConverterIndexPageControlBlock::resetFlushPerSPCounter
DataAccess::PersistentSpaceImpl::resetFlushPerSPCounter
ResourceManager::WaitAndSwitchCounterResource::determineFlushDestination

In this scenario each critical phase has a certain duration, even if no I/O writes are done. Reducing the number of disk LOBs in the system can reduce the runtime of the critical phase. With SAP HANA >= 2.00.048.04 and >= 2.00.054 this scenario is optimized and significantly shorter critical phases can be expected.

Following known thread methods:

GarbageLobFileHandler
RSLobGarbageRemover
LobGarbageCollector
LobGarbageCollectorJob

Also for LOBs Martin has created some amazing scripts packed into the SQL Collection (1969700):

HANA_GarbageCollection_LOBs

HANA_LOBs_HybridLOBActivation_CommandGenerator

HANA_LOBs_LOBFiles

HANA_GarbageCollection_LOBs_ColumnStore_Reclaims_1.00.122.17+

HANA_LOBs_LOBSizeHistogram_CommandGenerator

In the end LOBs should not be placed for a long time into a in-memory database like HANA. But for some circumstances it is not possible for every scenario. Here you have to find a traid-off between performance and ressources. Therefor you have to tune the thresholds of the parameters. This can save a lot of memory also during the initial sizing of the system.