Manual

1.7.5.9.5.2. Caching / geometric search index as of V9.08

1.7.5.9.5.2.1. Structure of indices

As of version 9.08, the storage location and the files of the indices have changed:

Storage location

$CADENAS_DATA\index\cat\...\

The Geo Index and the Topology Index are now stored separately.

The new Geo Index is located in the subfolder geoindexv2 (beneath the "old" geoindex), the Topo Index in the folder topoindex.

The following files are found in the directory geoindexv2:

  1. geomsearch.fdb:

    This file contains the fingerprints for the individual algorithms.

  2. geomsearch.cidb:

    This file contains part information that is relevant only when the fingerprints are generated; it does not have to be read during the search.

  3. geomsearch.ldx:

    This file contains the Linear Index for the individual search templates.

  4. geomsearch.ofm: This file contains a mapping of project paths to the internal IDs of the Geo Index.

    In one update mode, updates are not integrated directly into the Geo Index but are saved in special files in order to speed up the procedure. For more information, see Section 1.7.5.9.5.2.3, “Creation of Geo and Topo indices”. The following files are used for this purpose:

    • geomsearch.ufdb: Contains the fingerprints of the changes.

    • geomsearch.ucidb: Contains information on the changes that is not relevant for the search itself.

    • geomsearch.uop: Contains all the information needed to integrate the fingerprint changes into the Geo Index.

    The Geo Index may currently be in the process of being written. In this case another file is found in this directory:

    • geomsearch.lock: Prevents access to the Geo Index while it is being written.

The following files are found in the directory topoindex:

  1. topoindex.bin: This file contains the structured Topo data.

  2. topoindex.idx: This file contains the indices for searching the data.

    Quick updates are also possible for the Topo Index. In that case the following additional files are found in the directory:

    • topoindex.dupd: Structure data of the changes.

    • topoindex.upd: Information needed to integrate the changes into the index.

    This directory may also be locked:

    • topoindex.lock: Prevents access to the Topo Index while it is being written.
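
The file lists above make it possible to check the state of an index directory from outside the application. The following Python sketch is not part of the product; the helper name `describe_index` and the dictionary layout are hypothetical, and only the file names are taken from the lists above. It reports missing main files, pending quick-update files, and whether a lock file is present.

```python
from pathlib import Path

# File names as listed above; the grouping into dictionaries is an assumption
# made for this sketch.
GEO_LAYOUT = {
    "core": ["geomsearch.fdb", "geomsearch.cidb", "geomsearch.ldx", "geomsearch.ofm"],
    "update": ["geomsearch.ufdb", "geomsearch.ucidb", "geomsearch.uop"],
    "lock": "geomsearch.lock",
}
TOPO_LAYOUT = {
    "core": ["topoindex.bin", "topoindex.idx"],
    "update": ["topoindex.dupd", "topoindex.upd"],
    "lock": "topoindex.lock",
}


def describe_index(directory, layout):
    """Report which index files exist in a geoindexv2 or topoindex directory."""
    d = Path(directory)
    return {
        # main index files that are expected but not present
        "missing_core_files": [n for n in layout["core"] if not (d / n).is_file()],
        # quick-update files that have not been integrated yet
        "pending_update_files": [n for n in layout["update"] if (d / n).is_file()],
        # True while a lock file exists in the directory
        "locked": (d / layout["lock"]).is_file(),
    }
```

For example, `describe_index(path_to_geoindexv2, GEO_LAYOUT)` could be called before a backup to see whether an update is still in progress.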

1.7.5.9.5.2.2.  geomsearch.cfg -> Block [settings] - Common settings

The general settings in $CADENAS_SETUP/geomsearch.cfg are found in the block [settings].

  1. Key: searchindex

    Determines which Geo Index is used for searching. Possible values are:

    • old: Use the old index

    • new: Use the new index

    • both: Use the new index; if it is not available, use the old one

  2. Key: toposearchindex

    Determines which Topo Index is used for searching. Possible values are:

    • old: Use the old Geo Index

    • new: Use the new index

    • both: Use the new index; if it is not available, use the old Geo Index

  3. Keys: convertindex and converttopoindex

    Conversion of old indices into new indices. Possible values:

    • 0: Conversion deactivated

    • 1: Conversion activated. In this case the conversion happens automatically when an old index is created.

  4. Key: createNewDirectly

    Controls whether the old or the new index is created. Possible values:

    • 0: Create old index

    • 1: Create new index

  5. Key: createNewLinIndex

    Determines whether the Linear Index is created anew during the conversion. Possible values:

    • 0: Convert the old Linear Index. The conversion is quicker, but since this index uses fewer Pivot elements, the search is slightly slower.

    • 1: Create new Linear Index.
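
To check the two index-selection keys described above before a rollout, the [settings] block can be read with a generic INI parser. The following Python sketch is not a CADENAS tool; the function name `read_index_modes` is hypothetical, and it assumes that geomsearch.cfg can be parsed as a standard INI file. It returns the values of searchindex and toposearchindex and rejects values other than old, new and both.

```python
import configparser

# Values documented above for searchindex and toposearchindex.
ALLOWED_MODES = {"old", "new", "both"}


def read_index_modes(cfg_text):
    """Return the searchindex/toposearchindex values from the [settings] block.

    A key that is not set is returned as None; no default is assumed here.
    """
    # interpolation=None keeps characters such as '%' or '$' in other keys
    # from being interpreted; strict=False tolerates repeated keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(cfg_text)
    modes = {}
    for key in ("searchindex", "toposearchindex"):
        value = parser.get("settings", key, fallback=None)
        if value is not None and value not in ALLOWED_MODES:
            raise ValueError(f"{key}: unexpected value {value!r}")
        modes[key] = value
    return modes
```

Reading the file content first (for example with `open(path).read()`) and passing the text in keeps the helper easy to test without a real installation.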

1.7.5.9.5.2.3.  Creation of Geo and Topo indices

For the creation of indices there are three basic possibilities:

  1. New creation of the Geo Index

  2. Update of the existing index (working on a copy). Here the available index can also be an old Geo Index, which is then converted automatically.

  3. Update of the existing index (working with special update files). This makes sense if there are only few changes to large catalogs.

    • Variant 3 is much quicker than variant 2, especially for large catalogs.

    • Variant 3 has the disadvantage that the search becomes slower with each new update. That is why it makes sense to update the index using variant 2 from time to time, because the updates are then completely integrated into the index.

    • The Pivot elements of the Linear Index are not recalculated in variant 3. This may have a negative effect on search times.

Further notes

  • The Geo and Topo Index are always created together.

  • As a general rule, the 64 Bit variant performs better (as with the migration).

  • During the generation of the indices, the directories of the Geo and Topo Index are locked. Variants 1 and 2 work on a temporary copy, so the directory is locked only for write access at the beginning and the index can still be read. The index has to be locked for read access only for a short moment, when the old files are deleted and the temporary files are renamed. Variant 3 does not work on a copy, which is why access is locked during the whole update.

  • The key ThreadCountForLinearIndexCreator in the section settings_32 or settings_64 determines how many threads are used for the creation of the Linear Index. With the value "-1", the number of available cores is used. Each thread may need several hundred MB of working memory, which is why overly high values make no sense for 32 Bit.

    Example:

    [settings_32]
    ThreadCountForLinearIndexCreator=2
    
    [settings_64]
    ThreadCountForLinearIndexCreator=-1

1.7.5.9.5.2.4. Changes at the actual search

For the Geo and the Topo Index there are three different search modes each, depending on the entry in geomsearch.cfg, block [settings] (see above).

  • new: Only the new index is used. If it is not available, there are no results for this catalog.

  • both: The new index is used if possible. Where the new index is not available, the old index is converted in working memory.

  • old: The search uses the old indices.
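
The three modes above can be summarized as a small decision function. The following Python sketch only models the documented behavior; the function name `select_index` and the return labels are hypothetical. Returning None for the mode old when no old index exists is an assumption, because the text above does not describe that case.

```python
def select_index(mode, new_available, old_available):
    """Model which index a search would use for one catalog.

    Returns a label describing the index, or None if the catalog
    yields no results.
    """
    if mode == "new":
        # Only the new index is used; no fallback.
        return "new" if new_available else None
    if mode == "both":
        if new_available:
            return "new"
        # Fallback: the old index is converted in working memory.
        return "old (converted in memory)" if old_available else None
    if mode == "old":
        # Assumption: without an old index there are no results.
        return "old" if old_available else None
    raise ValueError(f"unknown mode: {mode!r}")
```

The sketch makes the practical difference visible: with new, a catalog that has only an old index silently returns nothing, whereas both still finds it at the cost of an in-memory conversion.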

Furthermore, as of version 9.08 the Linear Index is also used for the sketch search in order to reduce search times.

1.7.5.9.5.2.5.  Caching settings

All caching settings can be set separately for the 32 Bit and the 64 Bit variant of PARTdataManager.

Geo search
  • LinIndexCacheSize: Cache size for the Linear Index in KB

    Choose the value so that the cache is never used to its full capacity.

    If this is not possible, set a smaller value for the higher-level cache (SampleLineCacheSize).

    [CACHEV2_GEO_SEARCH_32]
    LinIndexCacheSize=100000

    [CACHEV2_GEO_SEARCH_64]
    LinIndexCacheSize=300000

  • OffsetCacheSize: Cache size for the Offset index in KB

    Choose the value so that the cache is never used to its full capacity.

    If this is not possible, set a smaller value for the higher-level cache (SampleLineCacheSize).

    [CACHEV2_GEO_SEARCH_32]
    OffsetCacheSize=50000

    [CACHEV2_GEO_SEARCH_64]
    OffsetCacheSize=150000

  • GeoIndexV2CacheSize: Number of Geo indices, which can be opened at the same time.

    Set the value so that it corresponds to the maximum number of catalogs.

    [CACHEV2_GEO_SEARCH_32]
    GeoIndexV2CacheSize=1000

    [CACHEV2_GEO_SEARCH_64]
    GeoIndexV2CacheSize=1000

  • SampleLineCacheSize: Cache for fingerprints in KB

    The cache is shared by all threads (the number of threads normally corresponds to the total number of processor cores).

    [CACHEV2_GEO_SEARCH_32]
    SampleLineCacheSize=100000

    [CACHEV2_GEO_SEARCH_64]
    SampleLineCacheSize=500000

  • LogFileName: If not empty, the log information is saved in this file. See below.

    Add the key yourself if it does not exist.

    [CACHEV2_GEO_SEARCH_32]
    LogFileName=c:\log\cachev2_geo_search_32.log

    [CACHEV2_GEO_SEARCH_64]
    LogFileName=c:\log\cachev2_geo_search_64.log

Important

When setting values for the Geo search, observe the following rules:

  1. Set LinIndexCacheSize and OffsetCacheSize large enough that they are never completely exhausted. If this is not possible, set a smaller value for the higher-level cache SampleLineCacheSize.

  2. If there is some memory left, set SampleLineCacheSize as large as possible.

Including / excluding catalogs

These settings can be used in server environments in order to exclude catalogs or to load only specific catalogs.

#:VALS_S
#:HELP;default;Include catalog, if the expression matches.
PreloaderIncludeRegexPos=
#:HELP;default;Include catalog, if the expression doesn't match.
PreloaderIncludeRegexNeg=
#:VALS_S
#:HELP;default;Exclude catalog, if the expression matches.
PreloaderExcludeRegexPos=.*copyright\.prj$|.*_qa$|.*_dev$
#:VALS_S
#:HELP;default;Exclude catalog, if the expression doesn't match.
PreloaderExcludeRegexNeg=

Also compare $CADENAS_USER/varsearch.cfg -> [VariableSearch:Path]:

The setting in geomsearch.cfg is used to cache the geometrical index; the one in $CADENAS_USER/varsearch.cfg [VariableSearch:Path] is used to cache the index for the variable and full-text search.
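
To see which names the default value of PreloaderExcludeRegexPos shown above would exclude, the expression can be tried out separately. The following Python sketch uses Python's own regular expression engine, which is an assumption: the regex dialect of the preloader and the exact string that is matched (catalog path or project path) are not described here. The helper name `is_excluded` and the sample names in the usage note are hypothetical.

```python
import re

# Default value of PreloaderExcludeRegexPos as shown above.
DEFAULT_EXCLUDE = r".*copyright\.prj$|.*_qa$|.*_dev$"


def is_excluded(name, pattern=DEFAULT_EXCLUDE):
    """Return True if the name matches the exclude expression."""
    # Each alternative starts with '.*' and ends with '$', so matching
    # from the start of the string checks the whole name.
    return re.match(pattern, name) is not None
```

With this pattern, a hypothetical name such as `cat/acme_qa` would be excluded, while `cat/acme` or `cat/acme_devel` would not, because the expression requires the name to end in `_qa`, `_dev` or `copyright.prj`.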

Topo search
  • ObjectDataCacheSize: Cache for Topo data nodes

    Especially important for the migration.

    [CACHE_TOPO_SEARCH_32]
    ObjectDataCacheSize=200000

    [CACHE_TOPO_SEARCH_64]
    ObjectDataCacheSize=1000000

  • IndexCacheSize: Cache for indices on the Topo data.

    Especially important for the Topo search.

    [CACHE_TOPO_SEARCH_32]
    IndexCacheSize=200000

    [CACHE_TOPO_SEARCH_64]
    IndexCacheSize=500000

  • LogFileName: If not empty, the log information is saved in this file. Also see Section 1.7.5.9.5.2.6, “Log file evaluation - Find best settings”.

    Add the key yourself if it does not exist.

    [CACHE_TOPO_SEARCH_32]
    LogFileName=c:\log\cachev2_topo_search_32.log

    [CACHE_TOPO_SEARCH_64]
    LogFileName=c:\log\cachev2_topo_search_64.log

1.7.5.9.5.2.6.  Log file evaluation - Find best settings

With the help of the log files mentioned above (key LogFileName), the search settings, and thus the search behavior, can be optimized for the existing data. The log file shows how full the respective cache is and how often a requested element was actually found in the cache (cache hits).

Example:

Search from Do 12. Dez 22:41:09 2013
Thread: 0xce8

Geometrical index cache: 
Capacity of the cache: 1000
In use: 18 (1.80%)
Free: 982 (98.20%)
Accesses to the cache: 6489
Cache hits: 6471 (99.72%)
Cache misses: 18 (0.28%)

Linear index cache: 
Capacity of the cache: 100000
In use: 4962 (4.96%)
Free: 95038 (95.04%)
Accesses to the cache: 1932
Cache hits: 1615 (83.59%)
Cache misses: 317 (16.41%)

Offset index cache: 
Capacity of the cache: 50000
In use: 1617 (3.23%)
Free: 48383 (96.77%)
Accesses to the cache: 7033
Cache hits: 6716 (95.49%)
Cache misses: 317 (4.51%)

Sample lines cache: 
Capacity of the cache: 100000
In use: 82785 (82.78%)
Free: 17215 (17.22%)
Accesses to the cache: 16840
Cache hits: 10985 (65.23%)
Cache misses: 5855 (34.77%)

Some notes on interpreting the data:

  1. The log file may not be updated until a further search is performed.

  2. In order to derive effective settings from the log file, it is important to perform several searches that reflect normal user behavior, for example with regard to the selection of search templates, sketch search or 3D search, and the selection of search parts.

  3. The figures on cache hits and cache misses that are written out with each search are cumulative.

  4. Data is not loaded into the cache until it is accessed for the first time. That is why the first access to data is always a cache miss. The more searches have been performed, the fewer cache misses should occur.
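
When many searches have been logged, reading the figures by hand becomes tedious. The following Python sketch shows one way to extract the fill level and hit rate per cache from log text in the format of the example above. It is not a CADENAS tool; the function names `parse_cache_log` and `summarize` are hypothetical, and the parser assumes that every cache block starts with a line ending in "cache:" and that blocks are separated by blank lines, as in the example.

```python
import re


def parse_cache_log(text):
    """Return {cache name: {field name: integer value}} from log text.

    If a cache appears several times, later values overwrite earlier ones,
    so the result reflects the last logged search.
    """
    caches = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line ends the current cache block
            continue
        header = re.fullmatch(r"(.+ cache):", line)
        if header:
            current = caches.setdefault(header.group(1), {})
            continue
        field = re.match(r"([^:]+):\s*(\d+)\b", line)
        if field and current is not None:
            current[field.group(1)] = int(field.group(2))
    return caches


def summarize(caches):
    """Compute fill level and hit rate (both 0..1) for each cache."""
    summary = {}
    for name, f in caches.items():
        accesses = f["Accesses to the cache"]
        summary[name] = {
            "fill": f["In use"] / f["Capacity of the cache"],
            "hit_rate": f["Cache hits"] / accesses if accesses else 0.0,
        }
    return summary
```

Applied to the example above, the sample lines cache stands out with the highest fill level and the lowest hit rate of the four caches, which fits the rule of making SampleLineCacheSize as large as the remaining memory allows.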

1.7.5.9.5.2.7. Access to Topology values via VBS

You can access the Topo values in the following way:

' main class to manage topology
set topoManager = CreateObject("cnstools.topomanager")

' fetch root node of topology tree
set catalogRoot = topoManager.findCatalogRoot("cat/norm/din")
stdprint("Number of projects in din: " & catalogRoot.childCount)

' fetch project node
set prjNode = topoManager.findProjectNode("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj")
stdprint("Number of lines in anlagenbau/armaturen/din_11864_1_a_asmtab.prj: " & prjNode.childCount)

' fetch line node
set lineNode = topoManager.findLineNode("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj", 20)

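' helper to recursively print all the attributes of a node
sub printAttributes(node, indent)
	' local variables, so that recursive calls do not share loop counters
	dim c, a, j, k, n, attr, attrType, value, valueAsString
	stdprint(indent & node.name)
	c = node.childCount
	for j = 0 to c - 1
		' "call" is needed when a sub is invoked with parenthesized arguments
		call printAttributes(node.child(j), indent & "  ")
	next
	a = node.attributeCount
	for j = 0 to a - 1
		set attr = node.attribute(j)
		' the type is compared as a string, so it is assigned without "set"
		attrType = attr.type
		valueAsString = ""
		if attrType = "doubleVec" then
			' the value is used as an object (count/item), so "set" is needed
			set value = attr.value
			n = value.count
			for k = 0 to n - 1
				if k > 0 then
					valueAsString = valueAsString & ", "
				end if
				valueAsString = valueAsString & value.item(k)
			next
		else
			valueAsString = attr.value
		end if
		stdprint(indent & "  " & attr.name & ": " & valueAsString)
	next
end sub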

' print all attributes for a line
stdprint("Recursive list of all attributes in line 20:")
stdprint()
call printAttributes(lineNode, "")
stdprint()

' print all attributes for a stl file
stdprint("Recursive list of all attributes in stl:")
stdprint()
set stlNode = topoManager.createNodeFrom3DFile("D:/stl/1 stl/001952002.stl", "STLFILE")
call printAttributes(stlNode, "")

' print all attributes for a prt file
stdprint()
stdprint("Recursive list of all attributes in prt file:")
stdprint()
set prtNode = topoManager.createNodeFrom3DFile("D:/stl/ein paar proe-Dateien/1202t4100_gen.prt.1", "NATWILDFIREPART 5 32 BIT")
call printAttributes(prtNode, "")

' print all attributes for a line (create fingerprints on the fly)
stdprint()
stdprint("Recursive list of all attributes in line 420:")
stdprint()
set lineNode2 = topoManager.createNodeFromProject("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj", 420)
call printAttributes(lineNode2, "")

' print all attributes for a project (create fingerprints on the fly)
stdprint()
stdprint("Recursive list of all attributes in project anlagenbau/armaturen/din_11864_1_a_asmtab.prj:")
stdprint()
set prjNode2 = topoManager.createNodeFromProject("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj", -1)
call printAttributes(prjNode2, "")