Do something for all physical pages in a project

Many of the batch programs I have written work on all physical pages of a project. A physical page is a RedDot CMS page that is published as a real .html file: if you set a filename on a physical page, RedDot CMS generates a file with that name. All other pages are non-physical, for example container pages or row pages used within tables.

For statistical purposes I need to

  1. collect all physical pages below a given start page into memory and
  2. access all collected pages and write out page meta data (created, modified and other values).

Because RedDot CMS does not offer a built-in flag that tells whether a page is physical, I added an element with a predefined name (a StandardField Text element named isPhysicalPage) only to those content classes whose instances are physical pages.

If you want to add such an element to your content classes, you also need to create a parameters page that belongs to the utility class PhysicalPage.

The first parameter, isPhysicalPageTmpltElemName, is the exact name of the element that identifies a physical page. Every page is checked with the method contains() and that parameter value, and it is treated as a physical page if it has such an element.

Once you have created such a structure in the project, you need to configure the page ID of the rql_script instance in the configuration file com/hlcl/rql/util/as/projectPage_parmPageIds.properties for the parameter com.hlcl.rql.hip.as.PhysicalPage.

The second parameter, shadowElementsNameSuffix, speeds up the crawling process. As the designer of the content classes you know below which List or Container elements a physical page can never appear in the tree.

Let’s assume such a List element is named text_list. Create a StandardField Text element named text_list_workflow_unlinked_flag (or use your own suffix) to mark the list text_list as one that need not be scanned for physical pages below it. The jRQL crawler checks for this element and skips any List or Container marked this way.
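The two naming rules can be illustrated with a small sketch. It does not use jRQL at all: a page is reduced to the set of element names of its content class, the class ElementNameChecks and the list name teaser_list are made up for this illustration, and only the element names isPhysicalPage and text_list_workflow_unlinked_flag come from the text above.

```java
import java.util.Set;

// Hypothetical stand-in, not part of jRQL: a page is reduced to its element names.
final class ElementNameChecks {
    // Parameter values as they might be entered on the parameters page.
    static final String IS_PHYSICAL_PAGE_ELEM_NAME = "isPhysicalPage";
    static final String SHADOW_ELEMENTS_NAME_SUFFIX = "_workflow_unlinked_flag";

    // A page counts as physical if it has the marker element.
    static boolean isPhysicalPage(Set<String> elementNames) {
        return elementNames.contains(IS_PHYSICAL_PAGE_ELEM_NAME);
    }

    // A List or Container is skipped if a shadow element named <name><suffix> exists.
    static boolean shouldSkip(String listOrContainerName, Set<String> elementNames) {
        return elementNames.contains(listOrContainerName + SHADOW_ELEMENTS_NAME_SUFFIX);
    }

    public static void main(String[] args) {
        Set<String> elementNames = Set.of(
                "isPhysicalPage", "text_list", "text_list_workflow_unlinked_flag", "teaser_list");
        System.out.println(isPhysicalPage(elementNames));            // marker element present
        System.out.println(shouldSkip("text_list", elementNames));   // shadow element present
        System.out.println(shouldSkip("teaser_list", elementNames)); // no shadow element
    }
}
```

In this sketch text_list would be skipped and teaser_list would be scanned, because only text_list has a shadow element with the configured suffix.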

Now your project is prepared and your parameters are configured. Let’s continue with what you need in Java.

Because the collection of all physical pages can be quite large (about 13,600 pages in my case), the page details are not handled via the Java collections API. Instead I use an in-memory SQL database called hsqldb; hsqldb.jar is included in the Eclipse project’s download. The following lines show how to instantiate the SQL database and prepare it for in-memory usage.

Class.forName("org.hsqldb.jdbcDriver");
Connection connection = DriverManager.getConnection("jdbc:hsqldb:mem:dbName", "sa", "");
PhysicalPagesWalker walker = new PhysicalPagesWalker(connection, "tableName");

The connection string is fixed, except for the dbName at the end. The next two parameters are the user name and the password; leave them unchanged, because sa with an empty password is the default account of a freshly created hsqldb database.

In the last line the physical pages walker is created with the database connection and the name of the table that is created within the database to store all pages.

Now you need to implement what you want to do for every physical page found while the crawl runs. For this purpose the walker is configured with a PageAction object. Its sole method is invoke(Page), which receives each physical page the walker identifies as its parameter. Create your own subclass of PageAction and implement what you need within the invoke(Page) method.

The following example uses a PageAction that simulates the usage of the page within SmartEdit, which I used to fill the page cache after deleting it.

Page startPg = project.getPageById(startPageId);
PageAction simulateSmartEditUsage = new SimulateSmartEditUsagePageAction();
walker.walk(startPg, simulateSmartEditUsage);
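A page action of your own could look roughly like the following sketch. To keep it compilable without jRQL on the classpath, Page and PageAction are declared here as minimal hypothetical stand-ins; the real jRQL classes offer far more and may declare checked exceptions. CollectHeadlinesPageAction and getHeadline() are likewise names chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the jRQL page class; only a headline is modelled.
final class Page {
    private final String headline;

    Page(String headline) {
        this.headline = headline;
    }

    String getHeadline() {
        return headline;
    }
}

// Hypothetical stand-in for the abstract action class with its sole method.
abstract class PageAction {
    // Called by the walker for every physical page it identifies.
    abstract void invoke(Page page);
}

// Example action: remember the headline of every physical page handed in.
final class CollectHeadlinesPageAction extends PageAction {
    private final List<String> headlines = new ArrayList<>();

    @Override
    void invoke(Page page) {
        headlines.add(page.getHeadline());
    }

    List<String> getHeadlines() {
        return headlines;
    }
}
```

The action object keeps its own state, so after the walk has finished you can ask it for the collected result.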

All pages below the given start page, the top page of your tree, are scanned recursively. Whenever a physical page is identified, the given page action is called.

The walker automatically checks whether it has already followed a page and all its children. Think of a page that is linked multiple times below your start page: the walker investigates this page only once, so the PageAction is called exactly once for every physical page found.
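The visit-once behaviour can be pictured with a simplified sketch. This is not the implementation of PhysicalPagesWalker: TreePage and VisitOnceWalker are hypothetical classes, the visited pages are kept in a plain HashSet instead of the hsqldb table, and a java.util.function.Consumer stands in for the PageAction.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical page model: an ID, a physical flag and the linked child pages.
final class TreePage {
    final String id;
    final boolean physical;
    final List<TreePage> children = new ArrayList<>();

    TreePage(String id, boolean physical) {
        this.id = id;
        this.physical = physical;
    }
}

final class VisitOnceWalker {
    private final Set<String> visitedIds = new HashSet<>();

    // Depth-first walk; a page reached a second time is not followed again.
    void walk(TreePage page, Consumer<TreePage> action) {
        if (!visitedIds.add(page.id)) {
            return; // already handled, including all its children
        }
        if (page.physical) {
            action.accept(page);
        }
        for (TreePage child : page.children) {
            walk(child, action);
        }
    }
}
```

If a physical page is linked both directly below the start page and below a container, the second encounter returns immediately, so the action sees that page only once.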

With this walker/page action framework you can cover a lot of requirements with a minimum of programming effort.

If you are interested in what the walker does, you can configure it with an anonymous PageListener object; see the example class within the Eclipse project for how to do it.

For all batch programs I recommend using log4j as the logging framework. With log4j I can easily write log files from my batch programs and send e-mails if an error occurs.

Combine a log4j debug() output with the walker’s listener and you get a log file showing which pages the walker goes through.
