Ask the Expert

Developing a parser for recognizing HTML tags

In my current project, I need to develop a parser for my language. It is not really a language: I have defined my own HTML-style tags, and all my work will be based on those tags. Is it possible to generate a parser that, given a segment of code containing my tags, recognizes the different tags in it?


You say that you want to parse your own HTML tags. Are they HTML tags or tags you made up? Either way, it doesn't really make a difference: you don't need to develop a parser. Grab any XML parser and feed it your tags. Whether you are using HTML tags or tags you invented yourself, you just need to make sure your markup follows the rules of XML: every tag is closed, tags are properly nested, and attribute values are quoted.
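As a minimal sketch of "grab any XML parser and feed it your tags", the snippet below uses the ElementTree parser from Python's standard library. The tag names (recipe, step) and the helper list_tags are made up for illustration; any well-formed markup would work the same way.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: invented tags (<recipe>, <step>) mixed with an HTML tag (<b>).
SAMPLE = (
    '<recipe name="tea">'
    '<step order="1">Boil <b>fresh</b> water.</step>'
    '<step order="2">Steep for three minutes.</step>'
    '</recipe>'
)

def list_tags(markup):
    """Return (tag name, attributes) pairs in document order."""
    root = ET.fromstring(markup)
    return [(elem.tag, dict(elem.attrib)) for elem in root.iter()]

if __name__ == "__main__":
    for tag, attrs in list_tags(SAMPLE):
        print(tag, attrs)
```

The parser has no prior knowledge of recipe or step; it recognizes them only because the markup is well-formed XML.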

In case you aren't familiar with it, XML is the eXtensible Markup Language. XML exists precisely to let you create your own set of tags that a parser will recognize. Go to for the definitive Web site on XML.

HTML is the HyperText Markup Language. XHTML is a version of HTML that conforms to the XML language requirements. Both HTML and XHTML are just predefined sets of tags. For more about the specifics of HTML/XHTML look here.
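To illustrate what conforming to the XML requirements means in practice, here is a small sketch that asks an XML parser whether a fragment is well-formed. The helper name is_well_formed and both sample strings are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Classic HTML habit: <br> is opened but never closed, which XML does not allow.
html_style = "<p>First line<br>Second line</p>"
# XHTML style: <br/> is a self-closing element, so the fragment is well-formed.
xhtml_style = "<p>First line<br/>Second line</p>"

if __name__ == "__main__":
    print(is_well_formed(html_style))
    print(is_well_formed(xhtml_style))
```

If your existing markup is written in the looser HTML style, it needs to be tidied into the second form before an XML parser will accept it.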

Now for some discussion directly relating to your question. XML is not a set of tags like HTML. XML allows you to define tags as needed by your application. The nicest thing about XML is that it allows you to do exactly what you're asking about.

You can define your own tags, or use HTML tags, and process them with a parser of your choice. A conforming XML parser already has the rules of XML built into it. There is no reason to write your own. I use the Oracle parser available at the Oracle Technology Web site.
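Because the parser already knows the rules of XML, your own code only has to react to the tags it reports. The sketch below uses the SAX interface from Python's standard library rather than the Oracle parser mentioned above; the TagCounter class, the count_tags helper, and the tag names in the usage line are hypothetical.

```python
import xml.sax

class TagCounter(xml.sax.ContentHandler):
    """Counts how often each tag opens; the parser handles all XML syntax."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser once for every opening tag it recognizes.
        self.counts[name] = self.counts.get(name, 0) + 1

def count_tags(markup):
    """Parse the markup and return a dict mapping tag name to occurrence count."""
    handler = TagCounter()
    xml.sax.parseString(markup.encode("utf-8"), handler)
    return handler.counts

if __name__ == "__main__":
    print(count_tags("<menu><item/><item/><note>hi</note></menu>"))
```

In a real application, startElement would dispatch to whatever processing each of your tags requires instead of just counting.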

There is also a free parser from Microsoft, MSXML.

You can also Google "XML PARSER" and get tons of hits.

I hope that answers your question. If you have any more, post away!

This was first published in November 2005
