Many articles describe how to convert XML documents into documents of other types using XSL Formatting Objects. Most of them show how to obtain HTML from XML, and only a few cover PDF documents.

Such articles usually demonstrate simple examples (how to build a table or output text). But sometimes we need to create more substantial reports or documents. In this case, developers face nontrivial problems, for example creating a table of contents (using internal or external links), bookmark trees, picture galleries, etc. This article will help you examine the main features of XSL schemes.

Contents:
1. Brief survey of the topic
2. C# example
3. XSL Features
3.1 Brief survey of main XSL operators and constructions
3.2 Usage of Cyrillic and East Asian fonts
3.3 Building the bookmarks tree
3.4 Building the table of contents
3.5 Page numbering
3.6 Logo insertion
3.7 Hyperlink insertion
3.8 Table rotation
3.9 Gallery
4. Conclusion

Brief survey of the topic

Generation of PDF documents is based on Apache FOP, which was originally written in Java. In our case, we use nFOP, a wrapper for .NET that is based on Visual J#. This wrapper makes it easier to write pure .NET reporting modules. The generation of PDF documents is simple and can be described in the following way:

We have an XML document (created by some application or by hand) and an XSLT scheme (stylesheet). Applying the scheme to the XML document produces an XSL-FO document, which FOP then compiles to PDF.

C# example

The process of PDF document generation can be described using a small example. Suppose there is a list of writers; each writer has books, and each book consists of articles.

This structure is represented by the Author and Book classes.

    //A writer and the list of books he or she created
    public class Author
    {
        public List<Book> created;
        public string name;
        public int id;
        ...
    }

    //A book with its properties and its list of articles
    public class Book
    {
        public int id;
        public Props bookProps;
        public List<Pare> articles;
        ...
    }

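C# lets you serialize such classes (public, with a parameterless constructor) to an XML document with the XmlSerializer class. In this case, we can provide the following example:

    //Serializing data
    //Requires: using System.IO; using System.Collections.Generic; using System.Xml.Serialization;
    List<Author> list = new List<Author>();
    list.Add(writer);
    list.Add(writer1);

    XmlSerializer s = new XmlSerializer(typeof(List<Author>));
    //The using block closes the file even if serialization fails
    using (FileStream fs = System.IO.File.Create("source.xml"))
    {
        s.Serialize(fs, list);
    }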

Then, according to the scheme described above, the XSL-FO document is generated:

    //Generate FO file
    XslTransform xslt = new XslTransform();
    //Loading XSL template
    xslt.Load("schema.xsl");
    //Transforming the XML document into the XSL-FO document
    xslt.Transform("source.xml", "source.fo");
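
The contents of schema.xsl are not listed in this section. As a minimal sketch of what such a scheme might look like, the stylesheet below outputs one block per author. It assumes the layout that XmlSerializer typically produces for List<Author> (an ArrayOfAuthor root with Author children and a name element per author); the page size, margins, and the master name "page" are illustrative choices, not values from the original project.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: ArrayOfAuthor/Author/name is the assumed input structure -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:template match="/">
    <fo:root>
      <!-- Page geometry: a single A4 page master with a body region -->
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-height="29.7cm" page-width="21cm"
            margin-top="2cm" margin-bottom="2cm"
            margin-left="2cm" margin-right="2cm">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>

      <!-- Content: one text block per author found in the source XML -->
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <xsl:for-each select="ArrayOfAuthor/Author">
            <fo:block font-size="12pt">
              <xsl:value-of select="name"/>
            </fo:block>
          </xsl:for-each>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

</xsl:stylesheet>
```

The fragments shown later in the article would go inside a template of a stylesheet like this one, which declares the xsl and fo namespace prefixes.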
    

The last step is to call the methods of nFOP, which compiles the XSL-FO document to PDF:

    //Generate PDF file
    java.io.FileInputStream streamFO = null;
    java.io.FileOutputStream streamOut = null;
    try
    {
        streamFO = new java.io.FileInputStream("source.fo");
        streamOut = new java.io.FileOutputStream(fileName + ".pdf");
        InputSource src = new InputSource(streamFO);
        Driver driver = new Driver(src, streamOut);
        driver.setRenderer(Driver.RENDER_PDF);
        driver.run();
    }
    catch(FOPException ex)
    {
        Console.WriteLine(ex.Message);
        throw;
    }
    catch (System.Exception ex)
    {
        Console.WriteLine(ex.Message);
        throw;
    }
    finally
    {
        if (streamOut != null)
        {
            streamOut.close();
        }
        if (streamFO != null)
        {
            streamFO.close();
        }
    }
    

The project must reference the following assemblies:

  • vjslib
  • nfop

vjslib.dll can be downloaded here.

nfop.dll can be downloaded here.

XSL Features

As mentioned above, most articles about XSL-FO cover common tasks such as the output of formatted text or the creation of tables. But often you also need to add the following to the document:

  • A table of contents with internal references to articles (taking into account that we do not know the page numbers beforehand, as the document is generated dynamically)
  • A bookmark tree (a convenient tool for navigating the document)
  • Footnotes for specific blocks of text and, perhaps, references to external web pages
  • A picture gallery (the pictures can also be saved dynamically from C# code).

  1. The bookmark tree.
  2. Table of contents with internal references to chapters of the document.
  3. Page numbering.
  4. Static picture (can be the company logo).

Brief survey of main XSL operators and constructions

The XSL language includes analogs of all the main control constructions:

  • if analog: xsl:if
  • switch analog: xsl:choose with xsl:when and xsl:otherwise branches
  • for analog: xsl:for-each (processes a number of XML nodes with the same level of nesting)
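
The three constructions listed above can be sketched in one fragment. This is an illustration under assumptions, not the original listing: the element names (ArrayOfAuthor, Author, created, Book, name) follow the serialized Author class, and the fragment is meant to sit inside a template of a stylesheet that declares the xsl and fo prefixes.

```xml
<!-- for: visit every Author node on the same nesting level -->
<xsl:for-each select="ArrayOfAuthor/Author">

  <!-- if: the block is emitted only when the author has at least one book -->
  <xsl:if test="count(created/Book) &gt; 0">
    <fo:block font-weight="bold">
      <xsl:value-of select="name"/>
    </fo:block>
  </xsl:if>

  <!-- switch: the first xsl:when whose test is true wins,
       xsl:otherwise plays the role of the default branch -->
  <xsl:choose>
    <xsl:when test="count(created/Book) = 0">
      <fo:block>No books</fo:block>
    </xsl:when>
    <xsl:when test="count(created/Book) = 1">
      <fo:block>One book</fo:block>
    </xsl:when>
    <xsl:otherwise>
      <fo:block>Several books</fo:block>
    </xsl:otherwise>
  </xsl:choose>

</xsl:for-each>
```

Note that xsl:if has no else branch; when two or more alternatives are needed, xsl:choose is the construction to use.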
    
        
    
    

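You can access a specific node by index (if you know its number in the list beforehand) by putting the index in square brackets after the node name, for example node[2], where node is a placeholder for the element name.

You can also use a variable as the index, for example node[number($index)].

The number($index) conversion returns the number parsed from the string stored in $index. Without it, the predicate [$index] is evaluated as a true/false test rather than as a position, because the variable is not of the number type by default. Every node then matches, and the first one is output, as if the index were always 1.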
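
Both ways of indexing can be sketched as follows. The path ArrayOfAuthor/Author/name is an assumed structure based on the serialized Author class, and the variable name index is illustrative.

```xml
<!-- Literal index: the name of the second Author in the list -->
<xsl:value-of select="ArrayOfAuthor/Author[2]/name"/>

<!-- Index kept in a variable: its content is text, not a number -->
<xsl:variable name="index">2</xsl:variable>

<!-- number() converts the stored text into a numeric position -->
<xsl:value-of select="ArrayOfAuthor/Author[number($index)]/name"/>
```

If the variable is instead declared with a select attribute that yields a number, such as select="position()", it already holds a number and the conversion is not required.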

If you use the position() function, it is better to also check a condition on it, for example by comparing it with last().

Here, position() is the number of the current node in the list being processed, and last() is the number of the last node in that list.
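
The exact condition from the original listing is not shown here. One common check, assumed for this sketch, compares position() with last() so that a separator is written after every item except the final one. The element names are again taken from the serialized Author class.

```xml
<fo:block>
  <xsl:for-each select="ArrayOfAuthor/Author">
    <xsl:value-of select="name"/>
    <!-- Emit the separator after every name except the last one -->
    <xsl:if test="position() != last()">
      <xsl:text>, </xsl:text>
    </xsl:if>
  </xsl:for-each>
</fo:block>
```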

Usage of Cyrillic and East Asian fonts

Reports often need text in specific languages. By default, the fonts required for them are not included in the list of nFOP fonts. To support a new language, you have to add a new font to the useconfig.xml file. The configuration file will look like the following:

 
     
    
        
            
                
                
                
                
            
        
    
    

Here, metrics-file is the path to the font metrics file, and embed-file is the path to the font file.

In the example above, the arialuni.ttf font (Arial Unicode MS) was added. On Windows it is located at WINDOWS\Fonts\ARIALUNI.TTF (it is typically installed together with Microsoft Office), and it is one of the most universal fonts. Fonts should be placed in the same folder as the configuration file.
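
A sketch of such a configuration, assuming the font section format used by the FOP 0.20.x releases. The metrics file name arialuni.xml and the family name ArialUni are illustrative; use the names that match your own files.

```xml
<configuration>
  <fonts>
    <!-- metrics-file: font metrics in XML form; embed-file: the font itself -->
    <font metrics-file="arialuni.xml" kerning="yes" embed-file="arialuni.ttf">
      <!-- One triplet per style/weight combination the documents will request -->
      <font-triplet name="ArialUni" style="normal" weight="normal"/>
      <font-triplet name="ArialUni" style="normal" weight="bold"/>
      <font-triplet name="ArialUni" style="italic" weight="normal"/>
      <font-triplet name="ArialUni" style="italic" weight="bold"/>
    </font>
  </fonts>
</configuration>
```

With this in place, a block in the XSLT scheme would select the font by the triplet name, for example font-family="ArialUni".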

Building the bookmarks tree

When building the bookmark tree (with references to the articles), you should add an id field to the nodes. References in the XSLT scheme are organized with its help.

We can create a static class that generates a unique id for each object that we need to refer to later:

    
    public static class IDGenerator
    {
        private static int id = 0;
        public static int Generate()
        {
            return id++;
        }
    }
    

The labels to refer to are placed while the tables are built: we just add one more attribute, id, to the current block, and references then point to it.
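
A minimal sketch of such a label, assuming it is written inside a template or loop whose current node is a Book. The bookProps/title path is hypothetical, and the "n" prefix is an added assumption that keeps the value a valid name, since XSL-FO id values may not start with a digit.

```xml
<!-- id is the field filled by IDGenerator; links will target "n" + id -->
<fo:block id="n{id}" font-weight="bold">
  <xsl:value-of select="bookProps/title"/>
</fo:block>
```

The curly braces are an XSLT attribute value template: the expression inside them is evaluated against the current node and its result is written into the attribute.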

    
    
        
            
        
    
    

The tree is built by placing blocks of the following type:

  
    
        
            
        
        
            
            
        
    
    

They can be nested, which creates the tree-like structure.
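
As a sketch, assuming the bookmark extension of the FOP 0.20.x releases (fox:outline, with the prefix declared as xmlns:fox="http://xml.apache.org/fop/extensions" on the stylesheet), a two-level tree of authors and their books might be written like this. The outlines are placed as children of fo:root; the element names follow the serialized classes, and the "n" prefix must match the one used in the id attributes of the target blocks.

```xml
<xsl:for-each select="ArrayOfAuthor/Author">
  <!-- Top-level bookmark: points to the block labelled with the author's id -->
  <fox:outline internal-destination="n{id}">
    <fox:label><xsl:value-of select="name"/></fox:label>

    <!-- Nested outlines become child nodes of the author's bookmark -->
    <xsl:for-each select="created/Book">
      <fox:outline internal-destination="n{id}">
        <fox:label>Book <xsl:value-of select="id"/></fox:label>
      </fox:outline>
    </xsl:for-each>
  </fox:outline>
</xsl:for-each>
```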

Building the table of contents

The table of contents of the document is created in a similar way. The same id field (as in the previous example) is used for references:

  
    
        
            
        
        Chapter 1
        

        
            
                
            
        
    
    

The <fo:leader leader-pattern="dots"/> line fills the space between the text and the page number with dots.
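
A sketch of one table-of-contents line per book, under the same assumptions as before: the element names come from the serialized classes, bookProps/title is hypothetical, and the target blocks carry id="n{id}".

```xml
<xsl:for-each select="ArrayOfAuthor/Author/created/Book">
  <!-- text-align-last="justify" lets the leader stretch to the right margin -->
  <fo:block text-align-last="justify">
    <!-- Clickable entry text that jumps to the labelled block -->
    <fo:basic-link internal-destination="n{id}">
      <xsl:value-of select="bookProps/title"/>
    </fo:basic-link>
    <!-- Dots between the entry text and the page number -->
    <fo:leader leader-pattern="dots"/>
    <!-- Page number of the labelled block, resolved at rendering time -->
    <fo:page-number-citation ref-id="n{id}"/>
  </fo:block>
</xsl:for-each>
```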

 

Page numbering

Another point worth noting is that, while building the PDF document, we do not know exactly which page a table will end up on. Here we can use an expression that determines the page number in the dynamically generated document automatically:
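
Two standard XSL-FO elements cover this; the sketch below shows both and is not necessarily the original listing. It assumes that the page master declares a region-after (for example <fo:region-after extent="1.5cm"/>) and that the target block carries id="n{id}".

```xml
<!-- Footer repeated on every page: number of the current page.
     fo:static-content must come before fo:flow in the page sequence. -->
<fo:static-content flow-name="xsl-region-after">
  <fo:block text-align="center">
    Page <fo:page-number/>
  </fo:block>
</fo:static-content>

<!-- Anywhere in the flow: the page on which the labelled block ends up -->
<fo:block>
  See page <fo:page-number-citation ref-id="n{id}"/>
</fo:block>
```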

  
    
        
            
        
    
    

Logo insertion

XSL templates allow you to create static content in any page region; this content is repeated on each page. For example, the following code inserts the logo into the lower right corner of the document:
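
A minimal sketch of such static content. The file name logo.jpg is hypothetical, and the page master is assumed to declare the bottom region, for example with <fo:region-after extent="2cm"/>.

```xml
<!-- Repeated at the bottom of every page of the page sequence -->
<fo:static-content flow-name="xsl-region-after">
  <!-- text-align="end" pushes the image to the right edge -->
  <fo:block text-align="end">
    <fo:external-graphic src="url(logo.jpg)"/>
  </fo:block>
</fo:static-content>
```

The same element with flow-name="xsl-region-before" would place the logo at the top of the page instead.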

 
    
        
        
            
        
    
    

A document frame can be built in a similar way. Create four separate images, one for each border of the document (top, bottom, left, and right), and use each of them as the background of the corresponding region (region-before, region-after, region-start, region-end).

Hyperlink insertion

To insert external references, you can use the following construction:
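
A minimal sketch of an external link, using one of the tutorial addresses listed at the end of this article. The url(...) form follows the XSL-FO specification; whether a bare address is also accepted depends on the FOP version. The color and underline are only a visual hint and are optional.

```xml
<fo:block>
  <!-- Clicking the text opens the address in the browser -->
  <fo:basic-link external-destination="url(http://www.w3schools.com/xslfo/)"
      color="blue" text-decoration="underline">
    XSL-FO tutorial
  </fo:basic-link>
</fo:block>
```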

  
    
        
        
        
            
                
            
            
        
    
    

Table rotation

Tables with numerous columns are often a problem to display, because such tables do not fit within the page borders.

Name | Prop1 | Prop2 | Prop3 | PropN
1 | Some data | Some data | Some data | Some data
2 | Some data | Some data | Some data | Some data

In this case, the table can be rotated so that each row of the original table becomes a separate table:

Table 1 – first row

Name | 1
Prop1 | Some data
Prop2 | Some data
Prop3 | Some data
PropN | Some data

Table 2 – second row

Name | 2
Prop1 | Some data
Prop2 | Some data
Prop3 | Some data
PropN | Some data

Sometimes the data to process is stored in two related lists of nodes of the XML document. For example, you need to fill in a table whose first column takes the names of the data from the Columns list and whose second column takes the corresponding values from the Rows list. The sample XML document contains the following data:

Columns: Name, Phone Number
Rows:
Johny, 555-55-55
Ann, 333-60-00

To create the table from the given XML document, you can use the following pattern:
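
A sketch of such a pattern. The element names in the assumed input (Table, Columns, Column, Rows, Row, Cell) are hypothetical, since the original tags are not shown here; only the nesting matters. For every row it emits a separate two-column table, which is also how the table rotation described above can be done.

```xml
<!-- Assumed input (element names are hypothetical):
     <Table>
       <Columns><Column>Name</Column><Column>Phone Number</Column></Columns>
       <Rows>
         <Row><Cell>Johny</Cell><Cell>555-55-55</Cell></Row>
         <Row><Cell>Ann</Cell><Cell>333-60-00</Cell></Row>
       </Rows>
     </Table> -->
<xsl:for-each select="Table/Rows/Row">
  <!-- One two-column table per row of the source data -->
  <fo:table table-layout="fixed" space-after="5mm">
    <fo:table-column column-width="4cm"/>
    <fo:table-column column-width="8cm"/>
    <fo:table-body>
      <xsl:for-each select="Cell">
        <!-- Remember the cell position before leaving the Rows branch -->
        <xsl:variable name="index" select="position()"/>
        <fo:table-row>
          <fo:table-cell>
            <fo:block>
              <!-- Cell -> Row -> Rows -> Table, then down to the column
                   name that has the same position as the current cell -->
              <xsl:value-of select="../../../Columns/Column[$index]"/>
            </fo:block>
          </fo:table-cell>
          <fo:table-cell>
            <fo:block><xsl:value-of select="."/></fo:block>
          </fo:table-cell>
        </fo:table-row>
      </xsl:for-each>
    </fo:table-body>
  </fo:table>
</xsl:for-each>
```

Because the variable is declared with select="position()", it already holds a number, so the number() conversion is not needed in the predicate.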

  
    
        
            
                
                
                
                    
                        
                            
                                
                                    
                                        
                                    
                                
                            
                            
                                
                                    
                                        
                                    
                                
                            
                        
                    
                
            
        
    

Gallery

The process of creating the picture gallery can be described in the following way:

  • The required pictures are saved to the temporary folder on the hard drive (programmatically).
  • The XML document is created. It contains the full paths and the description of the saved files (this data will be used later in the XSLT scheme).
  • The PDF document is generated. Pictures from the temporary folder are embedded in it.
  • The temporary folder with its contents is deleted.

For each picture, the XML document stores name/value pairs such as the following:

Name: Blue hills
Path: E:/pdf_root/Blue hills.jpg
Size: 28 521 bytes
Type: JPG File
Created: 12 may 2010, 15:01:36
...

The pattern that builds the gallery is given below:
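
A simplified sketch of a gallery pattern, not the original listing. It assumes a hypothetical input in which a Gallery element contains Picture elements with Name, Path, Size, Type, and Created children, and it assumes at least one picture is present. The exact URL form accepted in src (plain path, file: URL, or url(...)) depends on the FOP version.

```xml
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">Gallery</fo:block>

<fo:table table-layout="fixed">
  <fo:table-column column-width="8cm"/>
  <fo:table-column column-width="8cm"/>
  <fo:table-body>
    <!-- One table row per saved picture -->
    <xsl:for-each select="Gallery/Picture">
      <fo:table-row>
        <!-- Left cell: the picture itself, loaded from the temporary folder -->
        <fo:table-cell padding="2mm">
          <fo:block>
            <fo:external-graphic src="url('{Path}')" width="7cm"/>
          </fo:block>
        </fo:table-cell>
        <!-- Right cell: the description taken from the XML document -->
        <fo:table-cell padding="2mm">
          <fo:block font-weight="bold"><xsl:value-of select="Name"/></fo:block>
          <fo:block><xsl:value-of select="Type"/></fo:block>
          <fo:block><xsl:value-of select="Size"/></fo:block>
          <fo:block><xsl:value-of select="Created"/></fo:block>
        </fo:table-cell>
      </fo:table-row>
    </xsl:for-each>
  </fo:table-body>
</fo:table>
```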


    Gallery


 
   
   
     
       
        
          
         
         
         
             
               
              
                  
                        
                               
                                
                               
                           
                          
                      
                      
                          
                           
                           
                           
                               
                                
                                 
                                     
                                      
                                     
                                 
                                 
                                     
                                      
                                     
                                 
                                
                               
                           
                          
                      
                     
                 
                
               
           
          
      
     
 

    

As a result, we receive the following table:

Conclusion

In this article, we briefly described how to create a PDF report from data that already exists in a program, and gave a brief survey of the main constructions and operators of the XSL language. We also covered several simple but rather nonstandard situations: transposing tables, moving between levels of the XML document, and creating picture galleries.

For additional information, see the following links:

http://www.renderx.com/tutorial.html
http://www.w3schools.com/xslfo/
http://www.ecrion.com/Support/PDF/XSL-FOTutorial.pdf

 

Download example source (18,1KB)
