Intelligence Beyond the Known Universe (PLM Strategies Column)
31 Mar, 2007 By: Kenneth Wong
Enterprise search technologies redefine lifecycle data treatment.
Each quarter, when the editors of the Oxford English Dictionary release an update to their authoritative tome, they cautiously admit a handful of Valley jargon—that is, Silicon Valley jargon—into the repertoire of standard speech. Last June, Google was ushered in as a verb, along with text message (verb), uninstall (verb) and rewriteable (adjective). The legitimacy of Google as a noun, a reference to the search engine, is a given, but its emergence as a verb, as in "I Googled the latest news on IBM," has enormous significance. We began using the word as a verb because, over the years, the search engine has redefined how we search for and retrieve information. It marks the rise of a new data paradigm. We now expect information—from the price of a movie to the price of an automobile part—to be categorized, archived and displayed to us in a certain way—Google's way, for the most part.
This month, I venture beyond my usual coverage—the known PLM (product lifecycle management) universe—to look at a couple of search-technology suppliers that are reshaping the way we organize, inspect and reuse product and enterprise data. We're deliberately bypassing the usual suspects—Dassault, PTC and UGS—to concentrate instead on two firms that have made data visibility and access their core business. An exhaustive view of this market is simply beyond the scope of the column; this installment is but a casual glance at the possibilities supplied by enterprise search technologies and their importance in the product lifecycle, a factor that's often overlooked.
Google was brought to life by two Stanford students who'd set out to develop a better system for querying the World Wide Web. Endeca was conceived by two Princeton graduates looking for a better way to find a beer can on eBay.
In 1999, Steve Papa and Peter Bell logged on to the online auctioneer to buy an orange-and-black Princeton Reunion beer can with the emblem of their alma mater. After countless irrelevant search results, their frustration turned to inspiration. Here was an untapped market perhaps, a new business frontier created by information overload. Seven years later, Papa returned to Princeton to give a lecture titled, "Building a Pre-IPO Company in the Face of Recession, War and Google."
The German word entdecken, the root of the company's name, means "to discover." Similarly, the company's core product, the Endeca Information Access Platform, is supposed to let enterprises unearth business potentials buried beneath the information chaos. In analyst reports, Endeca often is listed as an enterprise search-engine vendor, but the broad categorization is somewhat misleading. Industry watcher Gartner places Endeca in the "Magic Quadrant for Information Access," a segment composed of "vendors with capabilities that go beyond enterprise search to encompass a collection of technologies, including search; content classification, categorization and clustering; fact and entity extraction; taxonomy creation and management, information presentation (for example, visualization) to support analysis and understanding . . . " ("Magic Quadrant for Information Access Technology, 2006," October 2006).
To digest and index the World Wide Web, Google's cofounder Larry Page created a crawler, which is a program that automatically scans and gathers information about the content of the Web pages and their relationships to one another. Endeca relies on a similar content acquisition system. "[The system] crawls unstructured content sources and ingests structured data," its literature states. "This includes relational databases, file servers, content management systems and enterprise systems such as enterprise resource planning (ERP), product lifecycle management (PLM), and master data management (MDM)."
The digested data is tagged by Endeca's MDEX engine, described as "a meta-relational index [that] holds detailed information about the text in data and content, the structure in databases and documents and the interrelationships among them." Endeca works with a variety of commercial data-management platforms, including IBM WebSphere, Dassault MatrixOne, PTC Windchill and SAP.
If you've ever shopped online at the Nike store, Home Depot, CDW or Wal-Mart, you may have already come in contact with Endeca's crawlers. The search boxes on these retailers' digital storefronts are powered by Endeca. Its business rivals include Fast Search & Transfer, based in Norway; Autonomy, dual-headquartered in San Francisco, California, and Cambridge, United Kingdom; and Google.
Parametric searches, commonly executed via checkboxes or drop-down menus, require a variety of input parameters from their users. If you're searching for a specific type of fixed capacitor, for example, you may need to provide its dimensions, maximum operating temperature or mounting style (surface mount, for instance) to narrow the results to a manageable list. Endeca, on the other hand, reduces the work to a simple search and navigation (or browse) interface. Behind the scenes, it uses stemming, wildcards, conceptual search and alphanumeric spelling correction to comb through the product specifications from the PLM system and inventory information from the ERP system. The results are organized by specific attributes to provide what it calls a "guided navigation experience" (figure 1).
Figure 1. Endeca Information Access Platform groups search results by relevant context to make navigation easier.
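The general idea behind guided navigation can be sketched in a few lines of Python. This is only an illustration of faceted search in the abstract, not Endeca's implementation: the part records, field names and the search and facets functions are all invented for the example. A keyword query returns matching parts, the matches are counted under each attribute value, and the user narrows the list by picking one of those values instead of filling in parameters up front.

```python
from collections import Counter

# Hypothetical part records; IDs, fields and values are invented for illustration.
PARTS = [
    {"id": "C-100", "desc": "ceramic fixed capacitor", "mount": "surface", "max_temp_c": 125},
    {"id": "C-101", "desc": "ceramic fixed capacitor", "mount": "through-hole", "max_temp_c": 85},
    {"id": "C-102", "desc": "tantalum fixed capacitor", "mount": "surface", "max_temp_c": 125},
    {"id": "R-200", "desc": "carbon film resistor", "mount": "through-hole", "max_temp_c": 155},
]

def search(parts, keyword, **selected):
    """Return parts whose description contains the keyword
    and that match every facet value the user has selected."""
    keyword = keyword.lower()
    return [
        p for p in parts
        if keyword in p["desc"].lower()
        and all(p.get(field) == value for field, value in selected.items())
    ]

def facets(results, fields):
    """Count how many results fall under each value of each facet field."""
    return {field: Counter(p[field] for p in results) for field in fields}

# Step 1: a plain keyword search.
hits = search(PARTS, "capacitor")

# Step 2: show the attribute groupings the user can drill into.
print(facets(hits, ["mount", "max_temp_c"]))

# Step 3: the user picks one grouping, and the list narrows.
narrowed = search(PARTS, "capacitor", mount="surface")
print([p["id"] for p in narrowed])
```

A production system would add the stemming, wildcard and spelling-correction steps mentioned above before matching; the sketch uses a plain substring test to stay short.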
Where does a turbocharged search engine fit in the product lifecycle? For new-product development and supply-chain management, it suggests potential savings in waste reduction through the ability to efficiently locate suitable parts for the projects. For sales, marketing and aftermarket support, especially in retail, it represents an easy mechanism for customers to locate a product within the desired price range, brand, style or store location.
Endeca pitches its search engine to manufacturers as a product-data navigation system, among other terms. The company also sells products for B2B commerce, intranet and knowledge management, and supply-chain analytics, all based on its IAP (information access platform). The latest release, Endeca IAP 5.0, was launched in November 2006. Implementation costs range from $150,000 to multiple millions, depending on information size, user needs and project scope.
In San Jose, California, roughly 20 miles away from the garage in Menlo Park that incubated Google's humble origin, Chris Groves runs Centric Software, an enterprise software company. Groves, a Harvard graduate and dot-com-era survivor, helped secure funding that led to the company's launch in 1998. Centric's motto used to be open PLM, a concept based on driving intelligent decision making by connecting disparate enterprise systems. After its January 2006 acquisition of Bellevue, Washington-based Product Sight, Centric began rebranding itself as a product intelligence company.
From this transaction, Centric netted three major products:
- FindView, which searches and extracts content information from Autodesk AutoCAD and Inventor, SolidWorks, PTC Pro/ENGINEER and Pro/E Wildfire and UGS NX Unigraphics and Solid Edge documents;
- Product Data Alchemist, which automatically identifies critical product-data assets within the corporate networks and facilitates knowledge mining, cleanup and PLM integration; and
- Product Sight Lifecycle Extensions, or Search-Powered PLM, which offers a single-point Web access to lifecycle data from real-world engineering, manufacturing and sales and service activities.
Product Sight's technology DNA can be found in Centric InSight 6.0, released in October 2006 (figure 2). The software suite is described as "an off-the-shelf product intelligent search application that discovers and classifies all structured and unstructured product information stored behind the firewall in disparate, enterprise-wide systems."
Figure 2. Centric InSight presents search results with related attributes and usage instances extracted from ERP and PLM systems.
Centric treats the clusters of product attributes scattered across networks as islands of information. These are the virtual vaults that house thousands of office documents, e-mails, CAD drawings, BOMs (bills of materials), warehouse inventories, accounting systems and so on. The idea is to keep the data where it resides; in other words, Centric avoids massive (or messy) data migration from one island to another. Instead, it uses connectors, programs that extract the most up-to-date information from relevant fields, to bridge those digital islands.
According to Centric, the connectors "automate the data collection and roll up of live and current data from MCAD (mechanical computer-aided design), EDA (electronic design automation), CAE (computer-aided engineering), PDM (product data management), ERP (enterprise resource planning), SCM (supply-chain management) and document-management systems in a real-time, secure, Web-based environment; present information in meaningful new combinations for dispersed team coordination and decision making . . . "
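To make the connector concept concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not Centric's code: the two dictionaries stand in for live PDM and ERP systems, and make_connector and roll_up are hypothetical names. Each connector reads a part's current fields from its own source at the moment it is asked, and the roll-up merges those fields into one view without copying anything into a central store.

```python
from typing import Callable, Dict

# Stand-ins for live source systems; part number, fields and values are invented.
PDM_DATA = {"P-42": {"revision": "C", "material": "aluminum"}}
ERP_DATA = {"P-42": {"on_hand": 310, "unit_cost": 4.75}}

Connector = Callable[[str], Dict[str, object]]

def make_connector(source: Dict[str, Dict[str, object]]) -> Connector:
    """Build a connector that reads a part's fields from one source at call time."""
    def fetch(part_number: str) -> Dict[str, object]:
        return dict(source.get(part_number, {}))
    return fetch

def roll_up(part_number: str, connectors: Dict[str, Connector]) -> Dict[str, object]:
    """Merge the current fields from every source into one view,
    prefixing each field with the name of the system it came from."""
    view = {}
    for system, fetch in connectors.items():
        for field, value in fetch(part_number).items():
            view[f"{system}.{field}"] = value
    return view

CONNECTORS = {"pdm": make_connector(PDM_DATA), "erp": make_connector(ERP_DATA)}
print(roll_up("P-42", CONNECTORS))

# The data stays in its source; a change there shows up on the next roll-up.
ERP_DATA["P-42"]["on_hand"] = 295
print(roll_up("P-42", CONNECTORS)["erp.on_hand"])
```

Because nothing is migrated, the aggregated view is only as current as the last query, which is the trade-off for leaving each island intact.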
One Search Does Not Fit All
"To be specific, Google does not understand or index the content of a title block in a CAD drawing, nor can it discover the relationships in the results, much less categorize the results," stated Centric in a white paper titled "Find Product and Project Content Just in Time."
One of Google's innovations is in recognizing not only the content of a Web page but also the other pages that reference the target page. Similarly, a CAD-compatible search engine offers greater value to users if it's able to take into account not only a part's attributes but also where the item has been used, something Centric InSight accommodates with its "Locate All Where-Used" feature.
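A where-used lookup is essentially a reverse index over bills of materials, much as a link-aware Web search keeps track of which pages point to a given page. The Python sketch below shows the general technique only; it does not describe how Centric's feature is built, and the assembly names, BOMS table and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical bills of materials: each assembly maps to its direct components.
BOMS = {
    "pump-assembly": ["housing", "impeller", "bolt-m6"],
    "valve-assembly": ["housing", "bolt-m6"],
    "skid": ["pump-assembly", "valve-assembly"],
}

def build_where_used(boms):
    """Invert the BOMs so that each component points to the assemblies that list it."""
    index = defaultdict(set)
    for parent, children in boms.items():
        for child in children:
            index[child].add(parent)
    return index

def where_used(part, index):
    """Return every assembly that uses the part, directly or through a subassembly."""
    found, stack = set(), [part]
    while stack:
        for parent in index.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

INDEX = build_where_used(BOMS)
print(sorted(where_used("impeller", INDEX)))  # ['pump-assembly', 'skid']
```

The index is built once from the parent-to-child data, so answering "where is this part used?" does not require rescanning every bill of materials.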
Centric InSight can be purchased in a configuration with as few as 25 users. According to the company, more than 40,000 users currently managing more than $10 billion in new products and projects are using the application.
Discovery is Only the Beginning
Better search results alone won't help you manage your business or shepherd your projects, but the ability to view your product data in an aggregated visual environment might. To this end, Centric offers the Centric Decision Center, a virtual boardroom to manage product launches and projects. The company calls it "a key component of [its] Product Intelligence solutions, which allow you to collect live, current, data to facilitate informed team collaboration with sharing and review of ideas, activities and deliverables; mine, merge and automate data collection from multiple source systems; . . . continually assess progress and align execution with business goals."
Centric's president and CEO Groves uses Google for his general-purpose Web searches. And Endeca's founders Steve Papa and Peter Bell eventually managed to purchase the elusive beer can for $5.