Manual

1.1.6.7.10.3. Cache search results in working memory

Most of the time spent during a search goes into reading data from the hard drive. The search is therefore faster if results can be cached in working memory.

Caching is especially recommended for use with the search server.

[Note] Note

When the PARTdataManager or the search server is restarted, the cache is lost. Effective caching is therefore only possible with a search server, since it is usually restarted only in exceptional cases.

[Note] Note

In addition to the caching described here, there is a further cache between the index files and the search server.

Important information on this is found under Section 1.1.6.7.9.2, “Caching index files of $CADENAS_DATA on Search server”.

The configuration file geomsearch.cfg contains different sections for configuring the caching:

  • Caching for versions up to 9.07

    [Cache_Local_Search]
    [Cache_Server_Search]

    [Note] Note

    These settings may remain relevant for versions as of 9.08 as well, because those versions can alternatively use the "old" search index.

  • Caching for versions as of 9.08

    [CACHEV2_GEO_SEARCH_32]
    [CACHEV2_GEO_SEARCH_64]

The sections are explained separately below:

1.1.6.7.10.3.1. Caching for versions up to 9.07

1.1.6.7.10.3.1.1. Description of different caches

Extract from $CADENAS_SETUP -> geomsearch.cfg.

[Cache_Local_Search]
maxOpenIndexCount=100
linIndexCacheSize=0
sampleLineListCacheSize=0
pivotDistListCacheSize=0
logFileName=

[Cache_Server_Search]
maxOpenIndexCount=100
linIndexCacheSize=150000
sampleLineListCacheSize=800000
pivotDistListCacheSize=50000
logFileName=

[Note] Note

The cache can be adjusted separately for the local search and for the search via the search server.

Caching also works with the PARTdataManager, but it is not effective there, because the cache is deleted with every restart. That is why the values linIndexCacheSize, sampleLineListCacheSize and pivotDistListCacheSize are set to '0' for the local search.

In the following you find an explanation for the individual caches:

  1. GeoIndexCache: Prevents the index from being reopened for every new search.[13] The value set corresponds to the maximum number of open indices.

    Set the value to the number of catalogs to be searched (in this example '100').

    maxOpenIndexCount=100

  2. sampleLineListCache:

    Cache for fingerprints

    The cache is shared by all threads (the number of threads usually corresponds to the number of processor cores; compare $CADENAS_SETUP/partsol.cfg -> Block "SEARCHSERVER" -> key "THREADS").

    Example: 80% of 1 GB available RAM (value in KB)

    sampleLineListCacheSize=800000

    [Recommendation: 80% for the initial setting]

  3. linIndexCache: (not used for the sketch search)

    Cache for the Linear Index

    Example: 15% of 1 GB available RAM (value in KB)

    linIndexCacheSize=150000

    [Recommendation: 15% for the initial setting]

  4. pivotDistListCache: (not used for the sketch search)

    Cache for the Linear Index[14]

    The cache is shared by all threads (the number of threads usually corresponds to the number of processor cores; compare $CADENAS_SETUP/partsol.cfg -> Block "SEARCHSERVER" -> key "THREADS").

    Example: 5% of 1 GB available RAM (value in KB)

    pivotDistListCacheSize=50000

    [Recommendation: 5% for the initial setting]
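
The example values above follow from splitting one memory budget in the ratio 80/15/5. The following Python sketch reproduces that arithmetic for an arbitrary budget. The function name and the constant are illustrative assumptions and not part of the product; only the three key names and the percentages are taken from this section.

```python
# Split a RAM budget across the three caches using the 80/15/5 starting ratio.
# SHARES and initial_cache_sizes are hypothetical names used for illustration.

SHARES = {
    "sampleLineListCacheSize": 80,
    "linIndexCacheSize": 15,
    "pivotDistListCacheSize": 5,
}


def initial_cache_sizes(budget_kb: int) -> dict[str, int]:
    """Return the starting value in KB for each cache key."""
    return {key: budget_kb * percent // 100 for key, percent in SHARES.items()}


if __name__ == "__main__":
    # 1 GB is counted as 1,000,000 KB here, matching the example values above.
    for key, size in initial_cache_sizes(1_000_000).items():
        print(f"{key}={size}")
```

The printed lines can be copied into the section [Cache_Server_Search]. They are only starting values and are meant to be refined with the log file.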

1.1.6.7.10.3.1.2. Log file evaluation - Find best settings

You can optimize the settings in 2 steps:

  1. In the first step, choose the settings according to general experience:

    1. Determine how much working memory you can allocate to the cache without restricting other processes.

    2. Divide this amount among the caches as follows:

      • sampleLineListCacheSize: 80%

      • linIndexCacheSize: 15%

      • pivotDistListCacheSize: 5%

    3. Enter the resulting values in KB into the keys named above.

    4. Set the value of the key maxOpenIndexCount to the number of catalogs to be searched.

    [Note] Note

    If you set all values to '0', the caching is deactivated.

  2. Optimization of values according to the evaluation of the log file

    In the configuration file geomsearch.cfg, specify where the log file is to be saved (key logFileName).

    [Note] Note

    After each search, a report is written that indicates how many cache hits there were and how much memory was used. The values are not reset after a search; the statistics accumulate across all searches.

    When the PARTdataManager or the search server is restarted, the log file is deleted.

    To find the optimal settings for the geometric search, please note the following before evaluating the log file:

    • Run searches that are representative of normal user behavior, for example with regard to the choice of search templates, sketch search or 3D search, and also the choice of search parts.

    • Ideally, let the search run for several days (search server) or over a long period of PARTdataManager use before evaluating the log file.

Example for evaluating the log file

GeoIndexCache CacheHits 999 of 1000, 99%
GeoIndexCache Files 99 of 100, 99%

SampleLineListCache CacheHits 999 of 1000, 99%
SampleLineListCache Memory 400000 of 800000, 50%

LinIndexCache CacheHits 10617 of 10776, 98%
LinIndexCache Memory 90000 of 100000, 90%

PivotDistCache CacheHits 100 of 10000, 1%
PivotDistCache Memory 9999 of 10000, 99%

For the GeoIndexCache you need not change anything. It is set to the number of catalogs to be searched through.

For the other three caches, pay attention to the following rules. (They are explained here using SampleLineListCache as an example; the same applies analogously to LinIndexCache and PivotDistCache.)

CacheHits

The first row shows the number of cache hits, which is a measure of the quality of the cache:

SampleLineListCache CacheHits 900 of 1000, 90%

  1. The first value (here 900) shows the number of accesses to elements that were already in the cache.

    The second value (here 1000) shows the total number of accesses.

  2. The third value shows the ratio of the first two values in percent. Here: 90%.

Memory

The second row shows the use of the working memory:

SampleLineListCache Memory 600000 of 800000, 75%

  1. The first two values indicate how many KB of the size set in the configuration file were used.

    In this example, 600000 KB of 800000 KB were used.

  2. The third value shows the ratio of the first two values in percent. Here: 75%.

CacheHits is the measure of how well the cache is used. If this value is high, the settings are OK.

Memory indicates whether the value set in the configuration file is properly dimensioned. If you have 100% for CacheHits and 10% for Memory, you can achieve the same hit rate with a considerably smaller allocation.

If the CacheHits are low (e.g. 10%) and the configured cache is used completely (e.g. 100%), try to raise the hit rate by increasing the cache size.
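
The two rules above can be applied to a log file automatically. The following Python sketch parses lines in the format shown in the example and derives a suggestion per cache. The function names and the thresholds (50% and 90%) are assumptions made for illustration; this section does not define exact limits.

```python
import re

# Matches lines such as "SampleLineListCache CacheHits 900 of 1000, 90%".
# Rows of another kind (for example "GeoIndexCache Files ...") are skipped.
LINE = re.compile(r"^(\w+) (CacheHits|Memory) (\d+) of (\d+), \d+%$")


def parse_cache_log(text: str) -> dict[str, dict[str, float]]:
    """Return {cache name: {"CacheHits": ratio, "Memory": ratio}}."""
    stats: dict[str, dict[str, float]] = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            cache, kind, used, total = match.groups()
            ratio = int(used) / int(total) if int(total) else 0.0
            stats.setdefault(cache, {})[kind] = ratio
    return stats


def advise(hits: float, memory: float) -> str:
    """Apply the two rules above; the thresholds are assumed values."""
    if hits < 0.5 and memory > 0.9:
        return "increase cache size"
    if hits > 0.9 and memory < 0.5:
        return "cache size can be reduced"
    return "keep setting"
```

For the PivotDistCache rows of the example log (1% hits, 99% memory) this sketch suggests increasing the cache size.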

1.1.6.7.10.3.2. Caching for Versions as of 9.08 / New Geometric Search Index

To improve the performance of the GeoSearch and the Topology Search, the structure of the Geo Index has been completely revised. Some important information on this follows.

1.1.6.7.10.3.2.1. New structure of indices

As of version 9.08 there are some changes concerning storage location and files:

Storage location

$CADENAS_DATA\index\cat\...\

The Geo Index and the Topology Index are saved separately now.

The new Geo Index is found in the subfolder geoindexv2 (beneath the "old" geoindex), the Topo Index in the folder topoindex.

The following files are found in the directory geoindexv2:

  1. geomsearch.fdb:

    This file contains the fingerprints for the individual algorithms.

  2. geomsearch.cidb:

    This file contains information on the part that is relevant only when the fingerprints are generated; it does not have to be read during the search.

  3. geomsearch.ldx:

    This file contains the Linear Index for the individual search templates.

  4. geomsearch.ofm:

    This file contains a mapping of project paths to the internal IDs of the Geo Index.

In one update mode, changes are not integrated directly into the Geo Index but are saved in special files in order to speed up the update. More information on this is found under Section 1.1.6.7.10.3.2.4, “Creation of Geo and Topo indices”. The following files are used for this purpose:

  • geomsearch.ufdb: Contains the fingerprints of the changes.

  • geomsearch.ucidb: Contains information on the changes that is not relevant for the search itself.

  • geomsearch.uop: Contains all information needed to integrate the fingerprint changes into the Geo Index.

While the Geo Index is being written, a further file is found in this directory:

  • geomsearch.lock: Prevents access to the Geo Index while it is being written.

The following files are found in the directory topoindex:

  1. topoindex.bin: This file contains the structured Topo data.

  2. topoindex.idx: This file contains indices for searching the data.

Quick updates are also possible for the Topo Index. In that case the directory additionally contains the following files:

  • topoindex.dupd: Structure data of the changes.

  • topoindex.upd: Information needed to integrate the changes into the index.

This directory may also be locked:

  • topoindex.lock: Prevents access to the Topo Index while it is being written.
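
The file names listed above make it possible to tell from the outside whether an index directory is currently locked or holds updates that have not yet been integrated. The following Python sketch is one way to check this; the function name and the three state labels are assumptions for illustration, and only the file and directory names come from this section.

```python
from pathlib import Path

# File names as listed above for the geoindexv2 and topoindex directories.
UPDATE_FILES = {
    "geoindexv2": ["geomsearch.ufdb", "geomsearch.ucidb", "geomsearch.uop"],
    "topoindex": ["topoindex.dupd", "topoindex.upd"],
}
LOCK_FILES = {"geoindexv2": "geomsearch.lock", "topoindex": "topoindex.lock"}


def index_state(index_dir: Path) -> str:
    """Classify a directory as 'locked', 'pending updates' or 'clean'."""
    kind = index_dir.name
    if kind not in LOCK_FILES:
        raise ValueError(f"not a geoindexv2 or topoindex directory: {index_dir}")
    if (index_dir / LOCK_FILES[kind]).exists():
        return "locked"
    if any((index_dir / name).exists() for name in UPDATE_FILES[kind]):
        return "pending updates"
    return "clean"
```

A result of 'pending updates' indicates that the index was last updated via special update files and has not been fully rebuilt since.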

1.1.6.7.10.3.2.2. General settings

General settings under $CADENAS_SETUP/geomsearch.cfg are found in the section [settings].

  1. Key: searchindex

    Determines which Geo Index is used for searching. Possible values are:

    • old: Use the old index

    • new: Use the new index

    • both: Use the new index; if it is not available, use the old one

  2. Key: toposearchindex

    Determines which Topo Index is used for searching. Possible values are:

    • old: Use the old Geo Index

    • new: Use the new index

    • both: Use the new index; if it is not available, use the old Geo Index

  3. Keys: convertindex and converttopoindex

    Conversion of old indices into new indices. Possible values:

    • 0: Conversion deactivated

    • 1: Conversion activated. The conversion then happens automatically whenever an old index is created.

  4. Key: createNewDirectly

    Controls whether the old or the new index is created. Possible values:

    • 0: Create old index

    • 1: Create new index

  5. Key: createNewLinIndex

    Determines whether the Linear Index is created anew during the conversion. Possible values:

    • 0: Convert the old Linear Index. The conversion is quicker, but since the old index uses fewer pivot elements, the search is slightly slower.

    • 1: Create a new Linear Index.
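
Because several of these keys accept only a few values, a small check before restarting the search server can catch typing errors. The following Python sketch reads the section [settings] with the standard configparser module and reports values that are not listed above. It assumes that geomsearch.cfg can be parsed as a plain INI file; the function name is hypothetical.

```python
import configparser

# Allowed values per key in [settings], as listed above.
ALLOWED = {
    "searchindex": {"old", "new", "both"},
    "toposearchindex": {"old", "new", "both"},
    "convertindex": {"0", "1"},
    "converttopoindex": {"0", "1"},
    "createNewLinIndex": {"0", "1"},
}


def check_settings(cfg_text: str) -> list[str]:
    """Return one message per key in [settings] with an unlisted value."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key names exactly as written in the file
    parser.read_string(cfg_text)
    problems: list[str] = []
    if not parser.has_section("settings"):
        return problems
    for key, allowed in ALLOWED.items():
        value = parser.get("settings", key, fallback=None)
        if value is not None and value.strip() not in allowed:
            problems.append(f"{key}={value} (expected one of {sorted(allowed)})")
    return problems
```

An empty list means that every key present in [settings] has one of the documented values; keys that are missing are not reported.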

1.1.6.7.10.3.2.3. Migration of indices

Via the Migration tool under PARTadmin -> Category -> Catalog update -> Migration (see Section 1.1.3.3.4, “Migration”) you can migrate any number of catalogs at once. The 64-bit variant of PARTadmin is recommended for this, because more working memory is available and the conversion is therefore much quicker. The Standards catalog can only be migrated as a whole; it is not possible to migrate only norm/din, for example.

1.1.6.7.10.3.2.4. Creation of Geo and Topo indices

There are three basic ways of creating indices:

  1. New creation of the Geo Index

  2. Update of the existing index (working on a copy). The existing index can also be an old Geo Index, which is then converted automatically.

  3. Update of the existing index (working with special update files). This makes sense if there are only a few changes to large catalogs.

    • Variant 3 is much quicker than variant 2, especially for large catalogs.

    • Variant 3 has the disadvantage that the search becomes slower with each new update. It therefore makes sense to update the index using variant 2 from time to time, because the updates are then completely integrated into the index.

    • The pivot elements of the Linear Index are not recalculated in variant 3. This can have a negative effect on search times.

Further notes

  • The Geo Index and the Topo Index are always created together.

  • In general, the 64-bit variant performs better (as with the migration).

  • During the generation of the indices the directories of the Geo and Topo Index are locked. Variants 1 and 2 work on a temporary copy. The directory is therefore initially locked only for write access, so the index can still be read. It has to be locked for read access only for a short moment, when the old files are deleted and the temporary files are renamed. Variant 3 does not work on a copy, so access is locked during the whole update.

  • The key ThreadCountForLinearIndexCreator in the section settings_32 or settings_64 determines how many threads are used for the creation of the Linear Index. With the value "-1" the number of available cores is used. Each thread may need several hundred MB of working memory, so high values make no sense for 32 bit.

    Example:

    [settings_32]
    ThreadCountForLinearIndexCreator=2
    
    [settings_64]
    ThreadCountForLinearIndexCreator=-1
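
When choosing a value for ThreadCountForLinearIndexCreator, the memory demand per thread limits how many threads are useful. The following Python sketch estimates a sensible thread count from the free memory. The figure of 300 MB per thread is only an assumed stand-in for "several hundred MB", and the function is a planning aid, not a description of what the index creator does internally.

```python
import os


def linear_index_threads(configured: int, free_ram_mb: int,
                         ram_per_thread_mb: int = 300) -> int:
    """Resolve the configured value and cap it by the available memory.

    configured: value of ThreadCountForLinearIndexCreator (-1 = all cores)
    ram_per_thread_mb: assumed memory demand of one thread
    """
    threads = (os.cpu_count() or 1) if configured == -1 else configured
    by_memory = max(1, free_ram_mb // ram_per_thread_mb)
    return max(1, min(threads, by_memory))
```

For a 32-bit process with roughly 1200 MB free, the sketch suggests at most 4 threads even if 8 are configured.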

1.1.6.7.10.3.2.5. Changes at the actual search

For both the Geo Index and the Topo Index there are 3 different search modes, depending on the entry in geomsearch.cfg, section [settings] (see above).

  • new: Only the new index is used. If it is not available, there are no results for this catalog.

  • both: The new index is used if possible. Indices that are not available in the new format are converted in working memory.

  • old: The search uses the old indices.

Furthermore, as of version 9.08 the Linear Index is also used for the sketch search in order to reduce search times.
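
The three search modes described above can be summarized as a small decision function. The following Python sketch is a reading aid only. The function name and the return labels are hypothetical, and the outcome for mode old without an old index is an assumption, since this section does not describe that case.

```python
def choose_index(mode: str, has_new: bool, has_old: bool) -> str:
    """Return which index a catalog is searched with under the given mode."""
    if mode == "new":
        return "new" if has_new else "no results"
    if mode == "both":
        if has_new:
            return "new"
        # The old index is converted in working memory for the search.
        return "old, converted in memory" if has_old else "no results"
    if mode == "old":
        # Assumption: without an old index there is nothing to search.
        return "old" if has_old else "no results"
    raise ValueError(f"unknown mode: {mode}")
```

For example, a catalog that has only an old index yields no results in mode new, but is still searchable in mode both.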

1.1.6.7.10.3.2.6. Caching settings

All caching settings can be set differently for the PARTdataManager 32 Bit and 64 Bit variant.

Geo search
  • LinIndexCacheSize: Cache size for the Linear Index in KB

    Choose the value so that the cache is never used to full capacity.

    If this is not possible, set a smaller value for the higher-level cache (SampleLineCacheSize).

    [CACHEV2_GEO_SEARCH_32]
    LinIndexCacheSize=100000

    [CACHEV2_GEO_SEARCH_64]
    LinIndexCacheSize=300000

  • OffsetCacheSize: Cache size for the Offset index in KB

    Choose the value so that the cache is never used to full capacity.

    If this is not possible, set a smaller value for the higher-level cache (SampleLineCacheSize).

    [CACHEV2_GEO_SEARCH_32]
    OffsetCacheSize=50000

    [CACHEV2_GEO_SEARCH_64]
    OffsetCacheSize=150000

  • GeoIndexV2CacheSize: Number of Geo indices, which can be opened at the same time.

    Set the value to the maximum number of catalogs.

    [CACHEV2_GEO_SEARCH_32]
    GeoIndexV2CacheSize=1000

    [CACHEV2_GEO_SEARCH_64]
    GeoIndexV2CacheSize=1000

  • SampleLineCacheSize: Cache for fingerprints in KB

    The cache is shared by all threads (the number of threads normally corresponds to the number of processor cores).

    [CACHEV2_GEO_SEARCH_32]
    SampleLineCacheSize=100000

    [CACHEV2_GEO_SEARCH_64]
    SampleLineCacheSize=500000

  • LogFileName: If this key is not empty, the log information is written to the specified file. See below.

    Add the key yourself if it does not exist.

    [CACHEV2_GEO_SEARCH_32]
    LogFileName=c:\log\cachev2_geo_search_32.log

    [CACHEV2_GEO_SEARCH_64]
    LogFileName=c:\log\cachev2_geo_search_64.log

[Important] Important

When setting values for the Geo search, please observe the following rules:

  1. Set LinIndexCacheSize and OffsetCacheSize large enough that they are never completely exhausted. If this is not possible, set a smaller value for the higher-level cache SampleLineCacheSize.

  2. If memory is left over, set SampleLineCacheSize as large as possible.

Topo search
  • ObjectDataCacheSize: Cache for Topo data nodes

    Especially important for the migration.

    [CACHE_TOPO_SEARCH_32]
    ObjectDataCacheSize=200000

    [CACHE_TOPO_SEARCH_64]
    ObjectDataCacheSize=1000000

  • IndexCacheSize: Cache for indices on the Topo data.

    Especially important for the Topo search.

    [CACHE_TOPO_SEARCH_32]
    IndexCacheSize=200000

    [CACHE_TOPO_SEARCH_64]
    IndexCacheSize=500000

  • LogFileName: If this key is not empty, the log information is written to the specified file. Also see Section 1.1.6.7.10.3.2.7, “Log file evaluation - Find best settings”.

    Add the key yourself if it does not exist.

    [CACHE_TOPO_SEARCH_32]
    LogFileName=c:\log\cachev2_topo_search_32.log

    [CACHE_TOPO_SEARCH_64]
    LogFileName=c:\log\cachev2_topo_search_64.log

1.1.6.7.10.3.2.7. Log file evaluation - Find best settings

With the help of the log files mentioned above (key LogFileName), the cache settings, and thus the search behavior, can be optimized for the existing data. The log file shows how full the respective cache is and how often a requested element was already in the cache (cache hits).

Example:

Search from Do 12. Dez 22:41:09 2013
Thread: 0xce8

Geometrical index cache: 
Capacity of the cache: 1000
In use: 18 (1.80%)
Free: 982 (98.20%)
Accesses to the cache: 6489
Cache hits: 6471 (99.72%)
Cache misses: 18 (0.28%)

Linear index cache: 
Capacity of the cache: 100000
In use: 4962 (4.96%)
Free: 95038 (95.04%)
Accesses to the cache: 1932
Cache hits: 1615 (83.59%)
Cache misses: 317 (16.41%)

Offset index cache: 
Capacity of the cache: 50000
In use: 1617 (3.23%)
Free: 48383 (96.77%)
Accesses to the cache: 7033
Cache hits: 6716 (95.49%)
Cache misses: 317 (4.51%)

Sample lines cache: 
Capacity of the cache: 100000
In use: 82785 (82.78%)
Free: 17215 (17.22%)
Accesses to the cache: 16840
Cache hits: 10985 (65.23%)
Cache misses: 5855 (34.77%)

Some notes for the interpretation of data:

  1. The log file may not be updated until a further search is performed.

  2. To derive effective settings from the log file, it is important to perform several searches that are representative of normal user behavior, for example with regard to the choice of search templates, sketch search or 3D search, and also the choice of search parts.

  3. The figures on cache hits and cache misses that are written with each search are cumulative.

  4. Data is loaded into the cache only on first access. The first access to any data is therefore always a cache miss. The more searches have been performed, the smaller the share of cache misses should become.
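
The log format shown above is regular enough to be evaluated by a script. The following Python sketch reads one or more log entries and returns the fill level and hit rate for each cache. The function names are assumptions for illustration, and only the English field labels from the example are recognized.

```python
import re

HEADER = re.compile(r"^(.+ cache):\s*$")
FIELD = re.compile(
    r"^(Capacity of the cache|In use|Accesses to the cache|Cache hits):\s*(\d+)"
)


def parse_v2_log(text: str) -> dict[str, dict[str, int]]:
    """Return the raw counters per cache; later entries overwrite earlier ones."""
    caches: dict[str, dict[str, int]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = caches.setdefault(header.group(1), {})
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = int(field.group(2))
    return caches


def usage_and_hit_rate(counters: dict[str, int]) -> tuple[float, float]:
    """Return (fill level, hit rate) as fractions between 0 and 1."""
    capacity = counters["Capacity of the cache"]
    accesses = counters["Accesses to the cache"]
    usage = counters["In use"] / capacity if capacity else 0.0
    hits = counters["Cache hits"] / accesses if accesses else 0.0
    return usage, hits
```

Because the counters are cumulative, overwriting earlier entries with later ones leaves the most recent totals. For the example above, the sample lines cache comes out at about 83% full with a hit rate of about 65%.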

1.1.6.7.10.3.2.8. Access to Topology values via VBS

You can access the Topo values in the following way:

' main class to manage topology
set topoManager = CreateObject("cnstools.topomanager")

' fetch root node of topology tree
set catalogRoot = topoManager.findCatalogRoot("cat/norm/din")
stdprint("Number of projects in din: " & catalogRoot.childCount)

' fetch project node
set prjNode = topoManager.findProjectNode("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj")
stdprint("Number of lines in anlagenbau/armaturen/din_11864_1_a_asmtab.prj: " & prjNode.childCount)

' fetch line node
set lineNode = topoManager.findLineNode("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj", 20)

' helper to recursively print all the attributes of a node
sub printAttributes(node, indent)
	stdprint(indent & node.name)
	c = node.childCount
	for j = 0 to c - 1
		printAttributes(node.child(j), indent & "  ")
	next
	a = node.attributeCount
	for j = 0 to a - 1
		set attr = node.attribute(j)
		value = attr.value
		valueAsString = ""
		attrType = attr.type ' compared with a string below, so assigned without "set"
		if attrType = "doubleVec" then
			n = value.count
			for k = 0 to n - 1
				if k > 0 then
					valueAsString = valueAsString & ", "
				end if
				valueAsString = valueAsString & value.item(k)
			next
		else
			valueAsString = value
		end if
		stdprint(indent & "  " & attr.name & ": " & valueAsString)
	next
end sub

' print all attributes for a line
stdprint("Recursive list of all attributes in line 20:")
stdprint()
printAttributes(lineNode, "")
stdprint()

' print all attributes for a stl file
stdprint("Recursive list of all attributes in stl:")
stdprint()
set stlNode = topoManager.createNodeFrom3DFile("D:/stl/1 stl/001952002.stl", "STLFILE")
printAttributes(stlNode, "")

' print all attributes for a prt file
stdprint()
stdprint("Recursive list of all attributes in prt file:")
stdprint()
set prtNode = topoManager.createNodeFrom3DFile("D:/stl/ein paar proe-Dateien/1202t4100_gen.prt.1", "NATWILDFIREPART 5 32 BIT")
printAttributes(prtNode, "")

' print all attributes for a line (create fingerprints on the fly)
stdprint()
stdprint("Recursive list of all attributes in line 420:")
stdprint()
set lineNode2 = topoManager.createNodeFromProject("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj", 420)
printAttributes(lineNode2, "")

' print all attributes for a project (create fingerprints on the fly)
stdprint()
stdprint("Recursive list of all attributes in project anlagenbau/armaturen/din_11864_1_a_asmtab.prj:")
stdprint()
set prjNode2 = topoManager.createNodeFromProject("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj", -1)
printAttributes(prjNode2, "")



[13] It only has to be opened once per thread (the number of threads usually corresponds to the number of processor cores). Each thread has its own cache.

[14] The Linear Index sorts parts according to their distance to certain reference parts. These reference parts are called pivots.