Jsoup example for editing HTML

When working with static web pages, we sometimes need to parse and edit them.

A simple example is generating a static HTML report, where an HTML parsing or editing library is useful.

Jsoup is one of the good libraries available for creating and editing HTML.

Below is a simple example of editing an HTML file with Jsoup.

There are three simple steps to editing HTML with Jsoup.

Step 1 : Parse the HTML

The following example parses an HTML file with Jsoup. The result is a Jsoup Document object.

Document reportDoc = Jsoup.parse(new File(filePath), "UTF-8");

Now we have reportDoc, the HTML document that we need to update. Let's say the HTML contains the structure shown below, and the problem is to add data to the table in the existing document.

Step 2 : Update, append or change the text in the document

[Image: JSoup1]

In general, every HTML object (i.e. a tag together with its contents) is called an element in Jsoup.

To insert data, find the Element for the table, append a row (a tr tag) to it, and put the data in td tags inside that row. We can insert all the data at once, or edit individual cells by the index of the tr and td, as shown below.

Element ele = reportDoc.getElementsByTag("tbody").last();
// A Java string literal cannot span lines, so the row is built by concatenation
ele.append("<tr class='styleincss'>"
     + "<td style='text-align: center;'>1</td>"
     + "<td style='text-align: left;'>desc</td>"
     + "<td style='text-align: left;'>col3</td>"
     + "<td style='text-align: left;'>col4</td>"
     + "<td style='text-align: left;'>col5</td>"
     + "<td style='text-align: center;'>col6</td>"
     + "<td style='text-align: center;'>Complete</td>"
     + "</tr>");
//This returns the updated html as a String
reportDoc.html();

//If we want to change the values of existing rows we can write
reportDoc.getElementsByTag("tr").last().getElementsByTag("td").get(4).text(data);

//If you want to change or add attribute
reportDoc.getElementsByTag("a").last().attr("style", "text-align: center;Color: " + colorCode);
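To try the steps above without a report file, here is a minimal sketch that parses an HTML string instead of a file. The sample markup, the class name JsoupTableEditSketch and the editReport method are made up for illustration, and it assumes Jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class JsoupTableEditSketch {

    // Hypothetical report markup; the real report structure is not reproduced here.
    static final String SAMPLE_HTML =
            "<html><body><table><tbody>"
          + "<tr><td>0</td><td>first</td><td><a href='#'>log</a></td></tr>"
          + "</tbody></table></body></html>";

    static Document editReport(String html, String status, String colorCode) {
        // Step 1: parse (from a String here instead of a File)
        Document reportDoc = Jsoup.parse(html);

        // Step 2a: append a new row to the last <tbody>
        Element tbody = reportDoc.getElementsByTag("tbody").last();
        tbody.append("<tr><td>1</td><td>desc</td><td><a href='#'>log</a></td></tr>");

        // Step 2b: overwrite the text of the second cell in the last row
        reportDoc.getElementsByTag("tr").last()
                 .getElementsByTag("td").get(1).text(status);

        // Step 2c: set an attribute on the last link
        reportDoc.getElementsByTag("a").last()
                 .attr("style", "text-align: center; color: " + colorCode);
        return reportDoc;
    }

    public static void main(String[] args) {
        System.out.println(editReport(SAMPLE_HTML, "Complete", "green").html());
    }
}
```

Running main prints the whole document, so the appended row and the changed cell can be checked by eye.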

 

The code above shows how to add inline HTML and how to edit cells by row and column index. We can also loop through the rows and write smarter code that goes to a particular row and column and edits the data there.

We can also set attributes such as color, or href for links, as in the example above.
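The looping idea could look roughly like the sketch below. It walks every row in the table body and updates one column only in the rows whose id column matches. The class name, the method name updateStatus and the column layout are assumptions for illustration, and Jsoup is assumed to be on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class JsoupRowLoopSketch {

    /**
     * Sets the text of column statusCol in every row whose column idCol
     * has the given id. Returns the number of rows that were changed.
     */
    static int updateStatus(Document doc, String id, int idCol, int statusCol, String newStatus) {
        int changed = 0;
        for (Element row : doc.select("tbody tr")) {
            Elements cells = row.getElementsByTag("td");
            // Skip rows that are too short to have both columns
            if (cells.size() <= Math.max(idCol, statusCol)) {
                continue;
            }
            if (cells.get(idCol).text().equals(id)) {
                cells.get(statusCol).text(newStatus);
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        // Hypothetical two-column table: id, status
        Document doc = Jsoup.parse("<table><tbody>"
                + "<tr><td>1</td><td>Pending</td></tr>"
                + "<tr><td>2</td><td>Pending</td></tr>"
                + "</tbody></table>");
        updateStatus(doc, "2", 0, 1, "Complete");
        System.out.println(doc.body().html());
    }
}
```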

But we have only changed the document in memory. How do we save it?

Step 3 : Save and flush to disk

We can use a BufferedWriter to write out the HTML of the document that Jsoup has just updated.

Let's say you have the name of the file you want to create and the updated document. Below is the code to save it to disk.

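import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import org.jsoup.nodes.Document;

public static void writeToFileAndFlushToDisk(Document doc, String outputFile) throws IOException {
    // try-with-resources flushes and closes the writer even if write() throws
    try (BufferedWriter htmlWriter = new BufferedWriter(
            new OutputStreamWriter(new FileOutputStream(outputFile), StandardCharsets.UTF_8))) {
        htmlWriter.write(doc.html());
        //Optional new line
        htmlWriter.newLine();
    }
}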
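As an alternative to the writer above, the same save can be sketched with java.nio.file.Files, which is a different technique from the one used in this post. The sketch below takes the HTML as a plain String (for example the result of doc.html()) so that it needs only the standard library. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SaveHtmlSketch {

    // Writes the html as UTF-8, replacing the file if it already exists
    static void save(String html, Path outputFile) throws IOException {
        Files.write(outputFile, html.getBytes(StandardCharsets.UTF_8));
    }

    // Reads the file back as UTF-8 text
    static String load(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Saves to a temporary file and reads it back, to check nothing is lost
    static String roundTrip(String html) {
        try {
            Path tmp = Files.createTempFile("report", ".html");
            try {
                save(html, tmp);
                return load(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<html><body><p>report</p></body></html>"));
    }
}
```

Passing the charset explicitly on both the write and the read side keeps non-ASCII text in the report intact regardless of the platform default.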

 

This is the general idea of how to use Jsoup for editing HTML files. I have used this to create simple HTML reporting for LeanFT tests. We can create files and dynamically update link references so that test results are generated systematically.

Please write to me if you need more information.

Page Object Model – Framework

Many projects require building a best-fit automation framework from scratch. Page Object Model (POM) is one of the most popular design patterns for Selenium automation.

In this post, let's work through the details. Before diving in, it is better to read this post to get an idea of what we are talking about.

Page Object Model – Approach

Let us look at the following packages to understand the structure.

The pages package contains the page classes for the site, and each page class contains the objects belonging to that page.

In any UI testing we typically do two things with objects: either we retrieve their properties or we perform actions on them.

[Image: eclipsstructure]

In the start section we write the code needed to start the browser: driver session creation, extent test creation, logger creation, etc.

The tests section contains the test classes we intend to write.

The utilities section contains the common actions class and the reusable action classes.

[Image: expandedpackages]

Also, it is better to use config properties for the entire test setup, along with the test data sheets. Keep the drivers in a separate folder for better readability.

[Image: projectStructure]

As shown above, the utilities package contains the Excel operations, such as getting and setting data.

Let's take a deep dive into the start class and what it does.

The start class has all the declarations and creates the WebDriver depending on the config settings.

Create a config file with the browser type and other details so that we can use them in the code.

[Image: config]

The start class has all the code for reading the config file, and it creates the WebDriver session according to the browser type given in the config.
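Reading the config and choosing a browser could look roughly like the sketch below. The property key browser, the Browser enum and the class name are assumptions for illustration, not the names used in this project. To stay runnable without Selenium, the sketch only decides which browser to use; in the real start class each case would create the matching WebDriver.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

class ConfigSketch {

    // Hypothetical set of supported browsers
    enum Browser { CHROME, FIREFOX, IE }

    // Loads key=value pairs; in a real project the Reader would wrap the config file
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Maps the "browser" property to a Browser, defaulting to CHROME when it is missing
    static Browser browserFrom(Properties config) {
        String name = config.getProperty("browser", "chrome").trim().toUpperCase(Locale.ROOT);
        try {
            return Browser.valueOf(name);
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unsupported browser in config: " + name);
        }
    }

    public static void main(String[] args) {
        Properties config = load(new StringReader("browser=firefox\nurl=http://example.test\n"));
        // The real start class would create the WebDriver for this browser here
        System.out.println(browserFrom(config));
    }
}
```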

If we look at the declarations, the entire project uses Extent Reports, as it generates beautiful reports of the test execution.

Also, all other variables are declared static, whereas the driver is non-static. The report logger is also static: each time we create a start instance the driver is recreated, but the report logger is not. This lets us run the same test for multiple iterations while keeping the same report.

In other words, we declare the xldriver, extent report and logger properties as static because they don't need to be reinitialized on each loop iteration. (Static variables are created once; they are not recreated for each instance of the class.)
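The difference between the static and non-static members can be seen in a small stand-alone sketch. The names StartSketch, reportLog and driverSession are made up, and a String stands in for the WebDriver so that no browser is needed.

```java
class StartSketch {

    // Shared by every instance: created once, like the report logger
    static final StringBuilder reportLog = new StringBuilder();
    static int instancesCreated = 0;

    // Per instance: a stand-in for the WebDriver session
    final String driverSession;

    StartSketch() {
        instancesCreated++;
        driverSession = "session-" + instancesCreated;
        // Every instance writes to the same shared log
        reportLog.append("started ").append(driverSession).append('\n');
    }

    public static void main(String[] args) {
        StartSketch first = new StartSketch();
        StartSketch second = new StartSketch();
        System.out.println("first driver:  " + first.driverSession);
        System.out.println("second driver: " + second.driverSession);
        System.out.print(reportLog);
    }
}
```

Each new StartSketch gets its own driverSession, while both write to the one reportLog, which is what keeps all iterations in the same report.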

[Image: start2]

The launch_browser method returns the pagefactory object with the created driver. The driver then flows to the subsequent pages.

[Image: return.PNG]

We will see why it returns the pagefactory object.

If we look at the pagefactory class, it has the definitions of all the pages.

We use the pagefactory constructor to pass the driver to all the subsequent classes.

The home page and the other pages inherit from this class so that they get the latest driver created by the start class.
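The way the driver travels from the base class to the pages can be sketched without Selenium. FakeDriver, BasePages, DemoHomePage and DemoSettingsPage below are hypothetical stand-ins, not the classes of this project.

```java
// Stand-in for a WebDriver: only carries an id so we can see which session a page uses
class FakeDriver {
    final String sessionId;

    FakeDriver(String sessionId) {
        this.sessionId = sessionId;
    }
}

// Plays the role of the pagefactory class: holds the driver and defines all pages
class BasePages {
    protected final FakeDriver driver;

    BasePages(FakeDriver driver) {
        this.driver = driver;
    }

    DemoHomePage home() {
        return new DemoHomePage(driver);
    }

    DemoSettingsPage settings() {
        return new DemoSettingsPage(driver);
    }
}

// Each page passes the driver up through the base constructor
class DemoHomePage extends BasePages {
    DemoHomePage(FakeDriver driver) {
        super(driver);
    }

    String describe() {
        return "home on " + driver.sessionId;
    }
}

class DemoSettingsPage extends BasePages {
    DemoSettingsPage(FakeDriver driver) {
        super(driver);
    }

    String describe() {
        return "settings on " + driver.sessionId;
    }
}
```

Because every page extends the base class, a page reached from another page still uses the same driver: new BasePages(new FakeDriver("s1")).home().settings().describe() gives "settings on s1".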

[Image: pagefactory]

Each page class contains the definitions of its elements and their initialization.

We use the PageFactory class (a Selenium-specific class) along with AjaxElementLocator, since it has the advantage of locating an element only when it is used, and we can also define a timeout for locating the element. We also use a reference class object for the common functions.

The reference class and the home class get the driver created in the start class.

[Image: home1.PNG]

Also, each method in a page class returns the same page, which makes tests easier to write.

[Image: home2]

Finally, we come to writing test cases.

Starting with the method name: we use the data sheet name as the test method name.

To avoid hard-coding the method name passed to the Excel setExcelSheet method, we get the method name programmatically and assign it.
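One way to get the current method name in plain Java is sketched below. It may not be the exact technique used in this project. The test method name loginTest is hypothetical, and the call to setExcelSheet is only indicated in a comment.

```java
class MethodNameSketch {

    // Hypothetical test method; its name doubles as the data sheet name
    static String loginTest() {
        // An anonymous class declared inside a method knows its enclosing method
        String sheetName = new Object() {}.getClass().getEnclosingMethod().getName();
        // In the real test this is where sheetName would be passed to setExcelSheet
        return sheetName;
    }

    public static void main(String[] args) {
        System.out.println(loginTest());
    }
}
```

If the test method is renamed to match a different sheet, the sheet name follows automatically.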

Since xldriver and the set_logger method are static, we don't have to create a start class object to use them.

Looking at the loop, we get the rows in the Excel sheet and run the required actions for each row. That means creating the driver, doing all the actions and logging out.

If we don't want to quit the browser between rows, we can avoid this by creating the start object outside the for loop.
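The two loop layouts can be compared in a stand-alone sketch. Rows are plain maps instead of Excel rows, a counter stands in for creating the start object, and all names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class DataLoopSketch {

    // Hypothetical data sheet with one column, "user"
    static List<Map<String, String>> sampleRows() {
        return Arrays.asList(
                Collections.singletonMap("user", "alice"),
                Collections.singletonMap("user", "bob"));
    }

    // Start object created inside the loop: one browser session per row
    static int sessionPerRow(List<Map<String, String>> rows, List<String> log) {
        int sessions = 0;
        for (Map<String, String> row : rows) {
            sessions++; // the start object would be created here
            log.add("session " + sessions + ": " + row.get("user"));
            // the browser would be closed here
        }
        return sessions;
    }

    // Start object created outside the loop: all rows share one session
    static int sharedSession(List<Map<String, String>> rows, List<String> log) {
        int sessions = 1; // the start object would be created here, once
        for (Map<String, String> row : rows) {
            log.add("session " + sessions + ": " + row.get("user"));
        }
        return sessions;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        System.out.println(sessionPerRow(sampleRows(), log) + " sessions when created inside the loop");
        System.out.println(sharedSession(sampleRows(), log) + " session when created outside the loop");
        log.forEach(System.out::println);
    }
}
```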

By putting a dot (.) after launch_browser(), we get access to the methods in the pagefactory class, and since all the other classes extend pagefactory, they all get access to its methods too. That is the reason for writing the page navigation methods in the pagefactory class.

This way we can write single-line test cases, which makes test cases easier to read, write and debug.
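A single-line test case in this style could look like the sketch below. It is only an illustration: a list of recorded steps stands in for the browser, and FluentStart, FluentPages, FluentLogin and FluentHome are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Plays the role of pagefactory: navigation methods live here
class FluentPages {
    protected final List<String> steps; // stand-in for the driver

    FluentPages(List<String> steps) {
        this.steps = steps;
    }

    FluentLogin loginPage() {
        steps.add("goto login");
        return new FluentLogin(steps);
    }

    FluentHome homePage() {
        steps.add("goto home");
        return new FluentHome(steps);
    }
}

// Page actions return the same page so calls can be chained
class FluentLogin extends FluentPages {
    FluentLogin(List<String> steps) {
        super(steps);
    }

    FluentLogin typeUser(String user) {
        steps.add("user=" + user);
        return this;
    }

    FluentLogin submit() {
        steps.add("submit");
        return this;
    }
}

class FluentHome extends FluentPages {
    FluentHome(List<String> steps) {
        super(steps);
    }

    FluentHome verifyTitle(String title) {
        steps.add("verify title " + title);
        return this;
    }
}

// Plays the role of the start class
class FluentStart {
    final List<String> steps = new ArrayList<>();

    FluentPages launchBrowser() {
        steps.add("launch");
        return new FluentPages(steps);
    }
}

class FluentTestSketch {
    static List<String> run() {
        FluentStart start = new FluentStart();
        // The whole test case is one chained statement
        start.launchBrowser().loginPage().typeUser("demo").submit().homePage().verifyTitle("Home");
        return start.steps;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The call to homePage() is possible directly after submit() because the login page inherits the navigation methods from the base class.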

[Image: TestCase]

The output of the test looks like below.

[Image: reports]

This is the gist of implementing the page object model with one sample test case.

In this post we have not covered the Excel data manipulations, hash maps and extent report creation.

We will try to address them in subsequent posts.

Thank you.