Scriba XML, Cross-Media Publishing

Publishing Connections Incorporated is better known in the industry by its acronym: PCI. PCI is a privately held company that has been active in publishing for fourteen years. It’s a small company with offices in Washington, Denver, Montreal, Slovakia and China. PCI puts most of its energy into Scriba, an XML conversion and workflow system that treats cross-media publishing as a workflow diagram.

PCI gave me a demo of the system, which I think must be the easiest way to get content from a print document into a Content Management System (CMS). Scriba can be used as an intermediate agent between InDesign and any XML-capable CMS, but if you’re using Softcare’s K4, you can just use K4 and let the Scriba XML Server engine do its work completely transparently in the background. On April 3, PCI and Quark also announced a closer working relationship, further extending this system’s appeal in the cross-media publishing market.

With customers like Time, McGraw-Hill, CMP Media, and Forbes, PCI has little to prove. Its technology works with Adobe products, Softcare K4, Quark technologies, MarkLogic, and Alfresco. Scriba integrates with these products through “Connectors”. Such Connectors are available for Adobe CS3, InDesign Server with or without K4 running on top of it, MarkLogic and Alfresco, SQL (JDBC needed), and QuarkXPress Server 7.

Scriba supports XML workflow management and XML transformation. It also acts as middleware, providing a bridge between production systems and data systems such as CMS, DAM and editorial workflow systems.

In general, a Scriba Connector allows direct check-in and check-out of content, and sometimes updating as well. It can also convert and extract XML from supported applications. The K4 Connector allows users to manipulate editorial content without the ongoing custom development and testing required by traditional scripting and XSL approaches. It leverages the functionality of the K4 XML Exporter to create a seamless conversion from InDesign and InCopy to XML formats for content aggregation, web workflows and ingestion into content repositories. I would say the Scriba K4 Connector is one of the most enticing modules a K4 user can buy.

Scriba supports industry-standard technologies such as XML, XSL, Unicode, regular expressions, XPath and DOM. It also supports Web Services and HTTP, and works through firewalls. There is no need to mount file systems or configure and maintain paths when using Scriba, and thanks to its Java-based architecture it runs on all major platforms, including Windows, Macintosh, Linux and Unix (Solaris).

Scriba is sold on a per-server basis and as such, it was a bit difficult to set it up on my two old Power Macs (the G5 is still more or less fine, but the older dual 450 MHz G4 is far too old to run much of anything these days). Instead, PCI demoed the Scriba system during an interesting session that lasted a little under an hour.

I was very impressed with the concept of this system. Scriba is based on flow diagram style scripting: you drag XML “Actions” (rectangular blocks with snippets of XML code, called Rules) into the main window and connect these Rules with each other by dragging connection lines between the appropriate “docks” on the blocks. The whole system reminded me of Quartz Composer, the visual programming tool every Mac OS X user can readily use once the Developer Tools have been installed.

[Figure: XML Rules as Blocks in a Flow Diagram]

The Rules all have limited functionality. It is connecting them together that creates the complete XML flow, and that makes Scriba so powerful and flexible. A set of Rules can be run to convert an InDesign file into an XML document and feed that document to an Alfresco CMS, for example. In InDesign you have a palette where you can designate the frames and text fragments that will make up the XML structure.
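The idea of small Rules chained into a flow can be sketched in a few lines of Python. This is only an illustration of the concept, not Scriba's own Rule format or API, which I did not get to see in detail: the names `Rule`, `rename_tag`, `strip_empty` and `run_flow`, and the sample tags, are all made up for the example.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# Hypothetical model: a "rule" is a function that takes an XML tree
# and returns the (possibly modified) tree.
Rule = Callable[[ET.Element], ET.Element]


def rename_tag(old: str, new: str) -> Rule:
    """Build a rule that renames every element called `old` to `new`."""
    def rule(root: ET.Element) -> ET.Element:
        for el in list(root.iter(old)):
            el.tag = new
        return root
    return rule


def strip_empty(tag: str) -> Rule:
    """Build a rule that removes `tag` elements with no text and no children."""
    def rule(root: ET.Element) -> ET.Element:
        for parent in list(root.iter()):
            for child in list(parent):
                is_empty = not (child.text or "").strip() and len(child) == 0
                if child.tag == tag and is_empty:
                    parent.remove(child)
        return root
    return rule


def run_flow(rules: List[Rule], root: ET.Element) -> ET.Element:
    """Run the rules in order, feeding each rule's output to the next."""
    for rule in rules:
        root = rule(root)
    return root


doc = ET.fromstring(
    "<story><head>Title</head><para>Body</para><para> </para></story>"
)
flow = [rename_tag("head", "h1"), rename_tag("para", "p"), strip_empty("p")]
result = ET.tostring(run_flow(flow, doc), encoding="unicode")
print(result)  # <story><h1>Title</h1><p>Body</p></story>
```

Each rule does very little on its own; the order in which they are connected is what defines the conversion, which is the point the flow diagram makes visually.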

There’s only one palette in there, although that one is too big for my liking, but then again, most palettes sit in my way, so it’s probably me who has a problem with them. The palette itself is there to designate the frames, to see or enter the metadata as required, and to extract the XML using the Adobe APIs rather than Adobe’s built-in XML export function. All further functionality comes from the Scriba server and can be adapted to your needs by manipulating the Rules.

For example, to create XHTML (a web page) from an InDesign file, you would designate in InDesign the frames whose content should carry over, create a flow that has the actions to create the XHTML file, and perhaps even the actions to upload the resulting file to the Web server, and run that action set. The result would be an automatic conversion of the InDesign file into an XHTML file. Needless to say, this whole process can be fully automated through the use of page geometry and style sheets.
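To make the XHTML step concrete, here is a minimal Python sketch of turning tagged content into a web page. It assumes the designated frames have already been extracted as simple XML; the `<article>`, `<headline>` and `<body>` tags and the `TAG_MAP` mapping are hypothetical, and this is not how Scriba itself implements the conversion.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML as it might come out of tagged InDesign frames.
SOURCE = """<article>
  <headline>Scriba XML</headline>
  <body>First paragraph.</body>
  <body>Second paragraph.</body>
</article>"""

# Hypothetical mapping from source tags to XHTML tags.
TAG_MAP = {"headline": "h1", "body": "p"}


def to_xhtml(source_xml: str) -> str:
    """Convert the tagged source XML into a small XHTML page."""
    src = ET.fromstring(source_xml)
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    head = ET.SubElement(html, "head")
    title = ET.SubElement(head, "title")
    title.text = src.findtext("headline", default="Untitled")
    body = ET.SubElement(html, "body")
    for el in src:
        target = TAG_MAP.get(el.tag)
        if target is None:
            continue  # skip anything the mapping does not know about
        out = ET.SubElement(body, target)
        out.text = (el.text or "").strip()
    return ET.tostring(html, encoding="unicode")


page = to_xhtml(SOURCE)
print(page)
```

An upload action would simply be one more step after `to_xhtml`; it is left out here because the target server is entirely site-specific.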

The flow system allows for flexibility as well. You can change your mind after a flow has been set up, but you can also branch output. If you want to direct XML content to a page with pictures and the same content to a page without pictures, that’s only a few blocks and connector lines away. The possibilities are nearly endless.
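Branching can be pictured as feeding one XML tree into two different chains. The Python sketch below is an assumption-laden illustration rather than Scriba code: the `<picture>` tag, the `without_pictures` helper and the branch names are invented for the example.

```python
import copy
import xml.etree.ElementTree as ET


def without_pictures(root: ET.Element) -> ET.Element:
    """Return a deep copy of the tree with every <picture> element removed."""
    clone = copy.deepcopy(root)
    for parent in list(clone.iter()):
        for child in list(parent):
            if child.tag == "picture":
                parent.remove(child)
    return clone


story = ET.fromstring(
    '<story><p>Text</p><picture href="a.jpg" /><p>More</p></story>'
)

# Two branches fed from the same source: one untouched, one text-only.
branches = {
    "with_pictures": story,
    "text_only": without_pictures(story),
}
outputs = {
    name: ET.tostring(tree, encoding="unicode")
    for name, tree in branches.items()
}
for name, xml_text in outputs.items():
    print(name, xml_text)
```

Working on a copy in the text-only branch keeps the original tree intact, so the two outputs cannot interfere with each other.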
[Figure: PCI Scriba, XML as Flow Diagram]

The only flaw that I could see was that you would be stuck if PCI doesn’t provide an action that you alone may need. The answer from the VP of Operations at PCI was quick and dry: “If you have the enterprise server, you can program it yourself, but if you can’t or you don’t want to, we can program it for you.” Programming it yourself is what most large publishers probably would do, as the Rules themselves are made up of XML and Java.

At the end of the demo, I was truly amazed to see that such a conceptually simple system can offer such a level of power and flexibility.

On April 3, 2008, Scriba also became available to QuarkXPress and QuarkXPress Server users. On that date, Quark and PCI announced that Quark is going to leverage the Scriba XML suite of tools. Quark is eager to support publishers who are increasingly delivering dynamic content to a wide variety of media (see IT Enquirer’s cross-media report). To do so, these publishers must be able to bridge disparate data systems and content formats to create a unified data stream with a minimum of manual intervention.

Quark recognizes that it doesn’t currently have the technology needed to enhance and repackage all that data for analysis and delivery to a wide variety of media, including print, PDF, web and mobile devices.

PCI’s focus on the development of conversion and XML workflow technology and Quark’s desire to capture a dominant position in this arena have driven these two companies into each other’s arms, so to speak. They will co-operate on future development and on worldwide distribution to publishers who want better content delivery applications.