In my previous article about Content Modeling in Alfresco, we went through how to create a custom content model and how to define custom content types and the properties within them.
Whenever content is uploaded to Alfresco, it is associated with one of the content types, and through that type it gets all the related properties attached to it. This set of properties helps make the content searchable.
How content can be searched depends on the way Lucene tokenizes and indexes those properties. The indexing behavior of each property can be set in the content model. By default, properties are indexed atomically, the property value is not stored in the index, and the property is tokenized when it is indexed.
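To make these defaults concrete, here is a minimal sketch of a property definition that spells them out explicitly. The my:reviewNotes name and its d:text type are hypothetical, and enabled="true" is assumed to be the default alongside the atomic, not-stored, tokenized behavior described above.

```xml
<!-- Hypothetical property; the index settings below restate the defaults -->
<property name="my:reviewNotes">
   <type>d:text</type>
   <index enabled="true">          <!-- assumed default: property is indexed -->
      <atomic>true</atomic>        <!-- indexed within the transaction -->
      <stored>false</stored>       <!-- value is not kept in the index -->
      <tokenised>true</tokenised>  <!-- value is tokenized before indexing -->
   </index>
</property>
```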
Indexing can be controlled by the following attributes within a content model property definition.
enabled: If this is false, there will be no entry for this property in the index.
atomic: If this is true, the property is indexed within the transaction; if not, the property is indexed in the background.
Indexing of content that requires transformation before being indexed (e.g. PDFs) will only obey atomic=true if the transformation takes less time than the value specified for lucene.maxAtomicTransformationTime.
stored: If true, the property value is stored in the index and may be obtained via the Lucene low-level query API.
This can be useful while debugging a system to see exactly what is being indexed, but do not set it to true on production systems.
tokenised: If “true”, the string value of the property is tokenized before indexing.
If “false”, it is indexed “as is” as a single string.
If “both”, then both of the above forms are in the index.
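As a sketch of the case where a property should have no entry in the index at all, the index element can simply set enabled to false. The my:internalFlag property below is hypothetical, and the example assumes the nested settings may be omitted when indexing is disabled.

```xml
<!-- Hypothetical property that is excluded from the index entirely -->
<property name="my:internalFlag">
   <type>d:boolean</type>
   <!-- No index entry is created, so this property cannot be searched on -->
   <index enabled="false"/>
</property>
```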
Content itself is not stored in the index; if it is indexed at all, it is indexed and tokenized.
The tokenizer is determined by the property type in the data dictionary. This is locale sensitive, as supported by the data dictionary, so you could switch to tokenizing all your content in another language. At the moment you cannot mix tokenization for another language with English tokenization.
<type name="cm:content">
   <title>Content</title>
   <parent>cm:cmobject</parent>
   <properties>
      <property name="cm:content">
         <type>d:content</type>
         <mandatory>false</mandatory>
         <index enabled="true">
            <atomic>false</atomic>
            <stored>false</stored>
            <tokenised>true</tokenised>
         </index>
      </property>
   </properties>
</type>
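By contrast, an identifier-like property may be better left untokenized so that it is indexed as a single string. The sketch below assumes a hypothetical my:invoiceNumber property in a custom model; it is an illustration of the tokenised setting, not a definition taken from Alfresco itself.

```xml
<!-- Hypothetical custom property: indexed "as is" as a single string -->
<property name="my:invoiceNumber">
   <type>d:text</type>
   <index enabled="true">
      <atomic>true</atomic>
      <stored>false</stored>
      <!-- "false" keeps the value untokenized; "both" would index both forms -->
      <tokenised>false</tokenised>
   </index>
</property>
```

Switching tokenised to both would keep the whole-string form while also indexing the tokenized form, at the cost of a larger index.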
Summary: There are many ways to leverage Lucene's search capabilities, but this article should give you clarity on how to control the indexing of Alfresco content properties.