Archive for the 'R&D' Category
Documentum to Portal Consistency Checker – Proof of Concept
Published February 29, 2012 Documentum , Open Source , OpenMigrate , R&D , TSG , Webtop Leave a Comment
Tags: Consistency, data integrity, Documentum, OpenMigrate, portal, PUMA, Webtop
TSG has several clients that use Documentum as a repository and a custom front-end application for consuming records or renditions of records. In most cases a publishing mechanism is in place, such as SCS (Site Caching Services) or TSG’s OpenMigrate PUMA (see the CIS Case Study for more details). While a typical Documentum application (ex: Webtop) provides a “one stop shop” for authors and approvers, the interface can be challenging when “consumers” are just looking for quick search and retrieval. A cached portal provides improved performance, business continuity, and the ability to add documents from other systems. One potential risk of using a cache of documents and metadata for search and retrieval is data integrity. Publishing techniques are designed to cache records accurately; however, uncontrollable circumstances may result in a mismatch. Continue reading ‘Documentum to Portal Consistency Checker – Proof of Concept’
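The idea of a consistency check between the repository and the portal cache can be illustrated with a small comparison routine. This is only a sketch under stated assumptions, not TSG's checker: it assumes each side can be exported as a map of object ID to some fingerprint (for example a version label plus a checksum), and the function name, IDs, and fingerprint values are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Mismatch(NamedTuple):
    object_id: str
    reason: str


def find_mismatches(repository: Dict[str, str], portal: Dict[str, str]) -> List[Mismatch]:
    """Compare two {object_id: fingerprint} maps and report every difference."""
    mismatches = []
    # Walk the union of IDs so records missing on either side are caught.
    for object_id in sorted(repository.keys() | portal.keys()):
        if object_id not in portal:
            mismatches.append(Mismatch(object_id, "missing from portal"))
        elif object_id not in repository:
            mismatches.append(Mismatch(object_id, "orphaned in portal"))
        elif repository[object_id] != portal[object_id]:
            mismatches.append(Mismatch(object_id, "fingerprint differs"))
    return mismatches


# Hypothetical exports from each side (IDs and fingerprints are made up).
repository = {"0900a1": "v3:ab12", "0900a2": "v1:cd34", "0900a3": "v2:ef56"}
portal = {"0900a1": "v3:ab12", "0900a2": "v1:0000", "0900a4": "v1:9999"}

for mismatch in find_mismatches(repository, portal):
    print(mismatch.object_id, "->", mismatch.reason)
```

Reporting the three kinds of mismatch separately matters in practice, because each one points at a different repair: republish, delete from the cache, or refresh.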
Documentum Transformation Services (DTS) – Alternative Approaches with Adobe LiveCycle and OpenOffice
Published April 8, 2010 Adobe , Alfresco , Documentum , DTS , LiveCycle , Open Source , R&D , Tech Tip Leave a Comment
Since the very first Momentum (1996 in a very windy Miami), the Documentum user community has pushed for a more reliable means of converting documents, mostly Microsoft Office documents, into PDF. Back then, during a wrap-up luncheon, the feedback on AutoRender (a previous incarnation of DTS) was anything but positive. Much like the complaints heard today, the main complaints included:
- Having to monitor/reboot the AutoRender Server throughout the day
- Unreliable PDF transformation, including:
  - Unsupported document types
  - Font replacement
  - Broken links
At the time, Documentum threw some engineering effort into AutoRender to address some of the shortcomings. One of the changes was to have AutoRender reboot itself (not really a fix but it did address some of the shortcomings). As with other Documentum products, TSG is occasionally asked for alternatives. This post will address some of the tools we use in non-Documentum environments that could easily be adapted to the PDF rendition needs for Documentum.
Adobe LiveCycle
For a couple of our non-Documentum customers, we have leveraged the Adobe LiveCycle component PDF Generator. We have been very impressed with its reliability and functionality. Considering Adobe created the best known implementation of Portable Document Format, it makes sense to rely on Adobe technology to convert your native content.
Documentum Full Text Search with Lucene – Honoring ACL Security
Published March 30, 2010 D6 , D6.5 , Documentum , HPI , Lucene , OpenContent , OpenMigrate , R&D , Search , Upgrades , Webtop 2 Comments
The last post discussed the results of an HPI Lucene Search test compared to a Webtop FAST Search as part of a proof of concept for a client looking to provide a consumer interface. As we have often mentioned on this forum, we continually see clients looking for a better search interface than Webtop, as well as some content cached outside of Documentum for business continuity, performance, and licensing.
One accurate comment raised by the post was that our comparison of HPI/Lucene against a Webtop/FAST search wasn’t really comparing apples to apples as the Webtop search was running against Documentum with security, while the Lucene search was not. While the client’s goals were to show the benefits of the cached repository and Lucene against Documentum, many Documentum users would like to know how Lucene would perform directly against a Documentum repository (as with upcoming DSS).
For this post, we will discuss TSG’s strategy and initial proof of concept results in leveraging Lucene for a Documentum full text search engine.
Continue reading ‘Documentum Full Text Search with Lucene – Honoring ACL Security’
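One generic way to honor ACL security with an external index (not necessarily the strategy TSG used, which this excerpt does not describe) is to store the groups allowed to read each document alongside it in the index and then drop any hit the searching user may not see. A minimal sketch, in which the `read_groups` field, the group names, and `filter_by_acl` are all hypothetical:

```python
from typing import Dict, List, Set


def filter_by_acl(hits: List[Dict], user_groups: Set[str]) -> List[Dict]:
    """Keep only hits whose read groups overlap the user's group memberships."""
    return [hit for hit in hits if user_groups & set(hit["read_groups"])]


# Hypothetical search hits, each carrying the groups copied from its ACL at index time.
hits = [
    {"title": "Exception report Q1", "read_groups": ["qa_readers", "qa_authors"]},
    {"title": "Exception report Q2", "read_groups": ["finance_readers"]},
    {"title": "Exception report Q3", "read_groups": ["qa_readers"]},
]

visible = filter_by_acl(hits, {"qa_readers"})
print([hit["title"] for hit in visible])
```

Filtering after the search is the simplest form of the idea. A production setup would more likely push the group restriction into the query itself so that result counts and paging stay correct.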
Documentum Search – Lucene versus FAST
Published March 17, 2010 Documentum , HPI , Lucene , Open Source , R&D , Search 12 Comments
As mentioned in a previous article, many clients are moving away from FAST in preparation for the eventual release of Documentum Search Services (DSS), slated for June, which leverages the open source product Apache Lucene. This post will share the results from one client that executed a proof of concept test to compare the two search engines.
Proof of Concept Approach – As we have mentioned before, many clients have decided to implement an external cache outside of Documentum to address business continuity, performance and licensing issues. For a large pharmaceutical client, TSG was tasked with performing a proof of concept on 156,000 documents in an external data source indexed by Lucene. The proof of concept would compare the search results of FAST within Documentum (Webtop) with those of Lucene (HPI) outside of Documentum. The proof of concept additionally evaluated leveraging Lucene for metadata storage rather than storing metadata in another database such as Oracle.
POC Findings – Lucene/HPI with the external repository was found to be considerably quicker than the existing FAST/Webtop implementation on most queries.
Specific results:
| Query | FAST/Webtop | Lucene/HPI |
| --- | --- | --- |
| 1200 Results | 90 seconds | 3 seconds |
| 8 Results | 5 seconds | 3 seconds |
| 10 Results | 8 seconds | 4 seconds |
| 76 Results | 10 seconds | 5 seconds |
| 5100 Results | 72 seconds | 5 seconds |
| 65 Results | 6 seconds | 3 seconds |
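To make the gap easier to read, the timings above can be turned into speedup ratios. The short script below only restates the numbers from the table; the variable names are ours.

```python
# (result count, FAST/Webtop seconds, Lucene/HPI seconds), copied from the table above.
timings = [
    (1200, 90, 3),
    (8, 5, 3),
    (10, 8, 4),
    (76, 10, 5),
    (5100, 72, 5),
    (65, 6, 3),
]

for result_count, fast_seconds, lucene_seconds in timings:
    speedup = fast_seconds / lucene_seconds
    print(f"{result_count:>5} results: Lucene/HPI {speedup:.1f}x faster")
```

The ratios show that the difference is modest (roughly 2x) for small result sets and grows sharply for the two largest ones.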
Even with a simple configuration, the Lucene index returned a more complete search result set than the standard FAST/Webtop configuration. The additional documents contained logical derivatives of the initial search words. For example, a search for “exception report” could return “exceptions report” or “exception reports”. The proof of concept data set also included German documents, and Lucene demonstrated multilingual stemming capability.
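The “exception report” behavior comes from stemming, where words are reduced to a common root before they are compared. The sketch below is a deliberately crude stand-in (it only strips a trailing plural “s”) and is not Lucene's analyzer; `naive_stem` and `matches` are hypothetical names used for illustration.

```python
def naive_stem(word: str) -> str:
    """Lowercase a word and strip a trailing plural 's' (a crude stand-in for real stemming)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def matches(query: str, text: str) -> bool:
    """True when every stemmed query term appears among the stemmed words of the text."""
    text_stems = {naive_stem(word) for word in text.split()}
    return all(naive_stem(term) in text_stems for term in query.split())


for title in ["Monthly exceptions report", "Quarterly exception reports", "Expense summary"]:
    print(title, "->", matches("exception report", title))
```

Real stemmers are language-specific, which is why the German documents in the data set were a meaningful test.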
Key Stats – Lucene
- 156,000 Documents – 31.6 Gigabytes
- Total Index Space – 521 MB
- Total Index Build Time – 10 hours – The client was very interested in the time it took to index the content and metadata in Lucene because they had experienced lengthy indexing times with FAST in their 5.3 upgrade. This was tracked as part of the proof of concept; however, the corresponding FAST data from the 5.3 upgrade is no longer available.
FAST and Lucene – Full Text Syntax Differences
- FAST
  - “One Two” – will return documents with the exact phrase “One Two” in the document
  - One Two – will return documents with the words One OR Two in the document
  - One+Two – will return documents with the words One OR Two in the document
  - One and Two – will return documents with the words One AND Two in the document
- Lucene – Based on the Proof of Concept’s configuration
  - “One Two” – will return documents with the exact phrase “One Two” in the document
  - One Two – will return documents with the words One AND Two in the document
  - One OR Two – will return documents with the words One OR Two in the document
  - One and Two – will return documents with the words One AND Two in the document
  - One+Two – will return documents with the exact phrase “One Two” in the document
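The differences listed above can be captured in a small lookup routine, which is handy when explaining to users why the same query behaves differently in each interface. This is a toy model of the rules as listed, not either engine's parser. Two things are assumptions on our part: operators are matched case-insensitively, and an explicit OR in FAST (which the list does not cover) is treated as OR.

```python
from typing import List, Tuple


def interpret(query: str, engine: str) -> Tuple[str, List[str]]:
    """Return (mode, terms) for a query, where mode is 'PHRASE', 'AND', or 'OR'.

    engine is 'fast' or 'lucene' (the proof of concept's Lucene configuration).
    """
    query = query.strip()
    # A quoted query is an exact phrase in both engines.
    if len(query) >= 2 and query[0] == '"' and query[-1] == '"':
        return "PHRASE", query[1:-1].split()
    # One+Two: OR in FAST, exact phrase in the Lucene configuration.
    if "+" in query:
        terms = [term.strip() for term in query.split("+") if term.strip()]
        return ("OR" if engine == "fast" else "PHRASE"), terms
    words = query.split()
    lowered = [word.lower() for word in words]
    if "and" in lowered:
        return "AND", [word for word in words if word.lower() != "and"]
    if "or" in lowered:  # assumption: FAST treats an explicit OR the same way
        return "OR", [word for word in words if word.lower() != "or"]
    # Bare words: FAST defaults to OR, the Lucene configuration to AND.
    return ("OR" if engine == "fast" else "AND"), words


for engine in ("fast", "lucene"):
    for query in ('"One Two"', "One Two", "One+Two", "One and Two"):
        print(engine, repr(query), interpret(query, engine))
```

The two rows that differ (bare words and the plus sign) are the ones most likely to surprise users moving from Webtop to HPI.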
Overall Thoughts
Overall, the client was very satisfied with the findings and is moving forward with the solution. The flexibility of Lucene to index both the metadata and full-text values allowed the client to avoid adding an additional Oracle database to their external cache for attribute storage. The client also liked the simpler, more intuitive search interface of HPI compared to the Webtop interface.
In addition to leveraging Lucene for searching an external cache, we are also working to leverage Lucene for internal Documentum/Webtop search.
If you have any questions or would like more detailed information, please contact us or comment below:
Thin Client Annotation tool uses Google Web Toolkit
Published November 9, 2009 Open Source , OpenContent , R&D , Tech Tip Leave a Comment
As part of our design and development for our Thin-Client annotation tool, we decided to combine the Google Web Toolkit (GWT) and Flex as the main implementation technologies. GWT is a relatively new set of open source tools that allows you to create and maintain front-end JavaScript applications in Java. This means that the front-end code is written in Java and compiled by GWT into optimized JavaScript that works across all major browsers. Eliminating handwritten JavaScript can greatly simplify front-end coding. However, if the application needs to do something that GWT cannot, JavaScript can be inserted straight into the Java program. A number of widget libraries are available from Google and third parties, but if none suits a specific need (as was the case with our sticky notes), it is quite simple to create a custom widget.
Another advantage of GWT is that it allows debugging in a hosted mode browser, so most changes in the client side code can be viewed by simply refreshing the browser. Several plug-ins are available which allow GWT development in different development environments including Eclipse.
GWT’s layout concept can be the main learning curve for developers unfamiliar with GWT. Most of the page layout is based on the placement of horizontal and vertical panels. All widgets in a horizontal panel will appear on the page lined up horizontally. If the user wants one of the widgets to be below the rest, that widget will need to be placed in a vertical panel with the previously mentioned horizontal panel. Once the layout concept is grasped, it quickly becomes quite intuitive, but it was frustrating at first. The only other issues we ran into came from the fact that GWT does so much internally (JavaScript compiling, RPC calls, etc.). This can make debugging more difficult, since it can be hard to identify the actual root of a problem.
Overall, GWT was a great tool for front-end Java development. It eliminated a lot of time that would otherwise have been devoted to JavaScript development. There are some really helpful tutorials on the Google Code website (http://code.google.com/webtoolkit/tutorials/1.6/index.html ) for anyone interested in learning more about development in GWT.
Here is a screenshot of our GWT Annotation tool interface. Be sure to check back often as we will be releasing our annotation demo shortly.

Google Web Toolkit Annotation Interface
Documentum Annotations – TSG View of the different solutions
Published October 8, 2009 Alfresco , Documentum , Open Source , R&D , Tech Tip Leave a Comment
With the upgrade to D6.5, many of our clients are reconsidering their annotation choices. This blog post will address some of the annotation product choices based on our experience, as well as our internal development efforts on our Free Viewer Tool, a thin client based on Adobe Flex that supports viewing and basic annotation.
Definition – this entry refers to an “annotation” as a mark-up “layer” on top of the document. Redline changes (like Word track changes) are embedded in the Word file and are not the focus of this entry.
Thick Client or Thin Client
One of the first decision points when choosing an annotation tool is between a thick or thin client. Early annotation tools required a client side component for client/server capabilities. With browser-based annotation tools, annotations might rely on either a client side plug-in or an applet. For Documentum, client components are required for Brava (applet), Annodocs and Documentum Annotation Services (Adobe Acrobat). Snowbound offers versions that either require no client component or use an applet-based approach. Our Free Viewer only requires Adobe Flash to be installed on the client. With a thin client approach, the image (not the entire file) is sent to the client. This can be a substantial performance improvement when viewing large files. Also, a thin client approach provides additional security since the file is never passed to the client.
TSG Thoughts – We usually recommend the thin client to improve performance and security while reducing IT support costs, particularly when extending the application to outside third parties.
Native Document Annotations or PDF-only
One approach would be to allow the mark-up layer to be viewed on top of any file format. Snowbound and Brava both support this type of annotation. Another approach would be to turn everything into PDF and only allow mark-ups on top of the PDF. This approach is required by Adobe and Annodocs, and is supported by Brava and Snowbound as well.
TSG Thoughts – Many of our clients have had difficulty with the native document approach, not through any fault of the vendor but because of the constantly evolving and backward compatible native file formats. For our Free Viewer, we are only supporting PDF or TIFF.
Annotation Capabilities
With all annotation tools, the number of graphic options (circle, arrow, highlight, underscore…) can confuse the user and blur the line between annotations and redlines. Also, one major user complaint is that annotations can be buried on subsequent pages, so users have to flip through the document to find them. Annotation tools should highlight/bookmark annotations when viewing the document so that the user does not have to check every page.
TSG Thoughts – We lean toward simple annotations for basic markup to reduce training costs and markup/review time.
Upgrade/Changing Considerations
It is important to understand that every annotation tool typically stores its annotations in a proprietary format, making it difficult to change annotation tools. When changing annotation tools, the existing annotations must be deleted or reformatted.
TSG Thoughts – For our Free Viewer, we have targeted Adobe’s new XFDF for mark-up to be compatible with Adobe as well as Documentum Annotation Services.
Documentum Upgrade – Inplace or Migration
Published September 29, 2009 D6 , D6.5 , Documentum , OpenMigrate , R&D , Upgrades 4 Comments
Tags: Migration, Upgrade
For many Documentum customers, deciding how to upgrade a Documentum system often boils down to whether to upgrade a clone of the environment in place or to leave the environment where it is and upgrade it in place on the existing hardware. This year, I worked with a client on a project to explore the differences between upgrading several Documentum systems in place versus migrating the documents straight to a new 6.5 installation. Many of the in-place upgrade complexities were due to the older database and OS.
- Oracle needed to go from 9i to 10.2.03 as well as be converted to UTF-8
- The Unix OS needed a significant upgrade, including the rack supporting the virtual partitions
- The Documentum Content Server required several upgrade steps. It needed to go from 5.2.5 (some 5.2) to 5.2.5 SP5, then 5.3 SP6, and finally to 6.5. I then did a separate upgrade to 6.5 SP2.
There were several project goals that could only be achieved with a migration strategy.
- Combine repositories from Windows installations and move them to a single UNIX installation
- Reorganize object model by flattening object hierarchy
- Undo custom folder configurations created many years ago
The technical complexities of upgrading in-place from 5.2.5, and the need to merge Documentum repositories, led the client to pick a migration approach for the upgrade.
Based on TSG’s upgrade experience with this client and others, we created an upgrade planning guide.
The planning guide is available here.
Please let me know your thoughts below.
Documentum Upgrade – High Volume Server – A Basic Understanding
Published September 22, 2009 D6.5 , Documentum , OpenMigrate , R&D , Upgrades 1 Comment
Tags: ECM, HVS, Migration
Documentum High Volume Server (HVS) is a new product designed to cut database space usage in Documentum 6.5 by a third or even up to one half depending on the type of content. Given the significantly reduced database size, overall performance should increase. This year TSG evaluated HVS for a client as part of a Documentum Upgrade. (See other thoughts in our Documentum Upgrade Planning Guide.)
HVS – When to use it
Basically, HVS was developed to efficiently store static, immutable content and meta-data. A good example is scanning/imaging, but COLD and other content/meta-data that will never change make sense as well. Content stored using HVS should not need to be versioned, rendered, annotated or changed. Otherwise, HVS converts the object from a lightweight object back to a normal Documentum object and the benefits of HVS are lost. Examples of content that are ideal for HVS include reports, invoices, check images, documents archived for historic purposes and reference, and emails.
HVS – How it works
HVS reduces the size of the database by sharing security and common meta-data amongst a set of lightweight objects. HVS can also partition the database to increase the rate content can be stored and retrieved. There are some limitations placed on the content to achieve these benefits. First, security is applied broadly to a lightweight object type. This results in all documents of a lightweight type being available to all users that can access the type even though a user may only need access to a portion of the documents. In other words, HVS cannot support the normal object-level ACL security and accordingly security may need to be built into the application layer. The other limitation, as already mentioned, is that documents cannot be versioned or changed.
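The space saving from sharing security and common meta-data can be pictured with a simple model in which many lightweight objects point at one shared parent. This is a conceptual sketch only, not Documentum's schema or the HVS DFC API. All class names, attribute names, and counts are invented, and the resulting ratio says nothing about real-world savings.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SharedParent:
    """Security and common meta-data stored once for a whole set of objects."""
    acl_name: str
    common_metadata: Dict[str, str]


@dataclass
class LightweightObject:
    """Holds only the values unique to one document plus a reference to its parent."""
    object_id: str
    own_metadata: Dict[str, str]
    parent: SharedParent


def stored_values_shared(objects: List[LightweightObject]) -> int:
    """Count stored values when each distinct parent is written only once."""
    parents = {id(obj.parent): obj.parent for obj in objects}
    shared = sum(1 + len(parent.common_metadata) for parent in parents.values())
    return shared + sum(len(obj.own_metadata) for obj in objects)


def stored_values_unshared(objects: List[LightweightObject]) -> int:
    """Count stored values when every object carries its own ACL and common meta-data."""
    return sum(1 + len(obj.parent.common_metadata) + len(obj.own_metadata) for obj in objects)


parent = SharedParent("invoice_readers", {"department": "AP", "doc_type": "invoice", "year": "2009"})
objects = [
    LightweightObject(f"inv-{n}", {"invoice_no": str(n), "vendor": "ACME"}, parent)
    for n in range(1000)
]

print("shared:  ", stored_values_shared(objects))
print("unshared:", stored_values_unshared(objects))
```

The same model also shows the security limitation described above: because the ACL lives on the shared parent, every object under that parent is visible to the same users.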
If you need to make large volumes of content available in near real time, the rapid ingestion feature of HVS may be of interest. Using special HVS DFC functions, applications can load raw database tables that contain the meta-data information for your lightweight object types. This is very different from typical DFC applications, which work strictly through the Documentum object layer. Rapid ingestion requires a custom program (Documentum does not currently have any tools that support it, including Captiva), and the DBA will also need to partition the database tables. The partitioning allows the data to be loaded into “offline” Documentum tables. The tables are then swapped with empty placeholder tables, making the newly loaded documents available while the Content Server stays up and running.
With a partitioned database, other new tricks are available in the HVS DFC to scope searches to particular database partitions. This can be handy if the system is very large and the user community is having unacceptable metadata search performance times.
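The load-offline-then-swap pattern and partition-scoped searching can be modeled in a few lines. This is a toy in-memory illustration of the concept, not the HVS DFC calls or any database's partition-exchange syntax. `PartitionedTable`, its methods, and the sample rows are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]


class PartitionedTable:
    """A table split into named partitions, each starting as an empty placeholder."""

    def __init__(self, partitions: List[str]) -> None:
        self.live: Dict[str, List[Row]] = {name: [] for name in partitions}

    def swap_in(self, partition: str, offline_rows: List[Row]) -> None:
        """Exchange a fully loaded offline table with the partition's empty placeholder."""
        if self.live[partition]:
            raise ValueError(f"partition {partition!r} is not an empty placeholder")
        self.live[partition] = offline_rows

    def search(self, predicate: Callable[[Row], bool],
               partitions: Optional[List[str]] = None) -> List[Row]:
        """Scan only the named partitions, or all of them when none are named."""
        names = list(self.live) if partitions is None else partitions
        return [row for name in names for row in self.live[name] if predicate(row)]


table = PartitionedTable(["2008", "2009"])

# Rows are loaded "offline" first, so searches never see a half-loaded partition.
offline_2009 = [{"id": "inv-1", "type": "invoice"}, {"id": "chk-1", "type": "check image"}]
table.swap_in("2009", offline_2009)

invoices = table.search(lambda row: row["type"] == "invoice", partitions=["2009"])
print(invoices)
```

Scoping the search to one partition means the other partitions are never scanned, which is the source of the performance benefit for very large systems.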
WHERE TO GO NEXT
When considering HVS, users should keep a few specific points in mind:
- Cost of HVS (will vary by installation)
- Performance Benefits versus normal database tuning
- Ingestion program development as this would be custom HVS DFC calls
In relation to the ingestion process, TSG has added support for HVS in OpenMigrate to help clients ingest new content as well as move existing content to HVS. One benefit of this approach is that one tool can be used for ongoing ingestion of new content while also supporting movement of existing content within the docbase (ex: archived items).
With our client, the proof of concept went well, but the client had not quite realized that HVS required additional cost and licensing. When the client weighed the benefits against the cost, the benefits did not outweigh the licensing cost and the database and Documentum support requirements, and the client did not move forward with HVS.
TSG Labs – Customizing Documentum’s CenterStage Pro
Published August 13, 2009 Active Wizard , Documentum , R&D , Tech Tip Leave a Comment
TSG recently had the privilege of test driving and customizing Documentum’s new collaboration solution, CenterStage Pro. CenterStage Pro is a next-generation collaboration tool and features a sleek new interface that does away with Documentum’s traditional WDK front ends.
To become familiar with the inner workings of CenterStage, we experimented with customizing the application to allow users to access our Active Wizard routing and approval tool. This is a common customization we see with many clients. The main difference between CenterStage and all previous Documentum web applications is it relies on JavaScript to do all of the work, including the implementation, as opposed to WDK’s use of Java on the backend. More specifically, CenterStage utilizes ExtJS. ExtJS is a cross-browser JavaScript library for building rich internet applications. The implementation of the actions that we created uses JavaScript objects, populated with Documentum properties for the current object, in order to pass the correct parameters into the Active Wizard.

Customization of CenterStage Pro Top Nav
We updated CenterStage with two very similar actions. The first action that we added launched the Active Wizard in the same window, automatically logging the user in. The user then had access to the entire Active Wizard, and when they completed their work, the “Return” link in the Active Wizard returned them to CenterStage, automatically logging them in and returning them to their last location. The second action that we added implemented all of the first action, with the addition of the ability to route a specific document using an Active Wizard form.

Customized File Actions Menu
The overall similarities with WDK in terms of the multiple files should be an advantage to anyone familiar with WDK. However, while WDK splits up the different files (i.e., NLS properties, action XML, etc.) into individual files for each action, CenterStage tends to group all of the string definitions into one file, all of the action definitions into one file, etc. This centralizes the work quite a bit, as you are not creating a suite of new files for each additional action.
CenterStage surprised us with its overall ease of adding the simple actions that we set out to implement. The lack of documentation made the original development somewhat difficult, as we were forced to look through the existing code to see what kind of methods existed on the JavaScript objects that Documentum was using. However, once we were able to utilize a select group of methods to our advantage, putting together the implementation for our actions was relatively simple.
Since CenterStage is still at version 1.0, it has limited customization capabilities. Documentum has stated that an official customization SDK should be available with version 1.5.
Be sure to check out a video of our customizations in TSG’s LearningZone.