Developing a parser for recognizing HTML tags
In my current project, I need to develop a parser for my language. It is not really a language: the project uses my own HTML-style tags, and all my work will be based on those tags. Is it possible to generate a parser that, given a segment of code containing my tags, recognizes the different tags in my language?


You say that you want to parse your own HTML tags. Are they HTML tags or tags you made up? Either way, it doesn't really make a difference: you don't need to develop a parser. Grab any XML parser and feed it your tags. Whether you use HTML tags or tags you invent yourself, you just need to follow the rules of XML: close every tag, nest tags properly, quote attribute values, and wrap everything in a single root element.
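As a rough illustration of what "following the rules of XML" means in practice, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The tag names (report, title, chart) and the helper is_well_formed are made up for this example; they are not from the question or from any particular product.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise.

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Invented tags that follow the XML rules.
good = '<report><title>Q3</title><chart type="bar"/></report>'

# <title> is never closed.
bad_unclosed = '<report><title>Q3</report>'

# The attribute value is not quoted.
bad_unquoted = '<report><chart type=bar/></report>'

# Ordinary HTML habit: <br> without a closing slash.
bad_html = '<p>Line one<br>Line two</p>'

for sample in (good, bad_unclosed, bad_unquoted, bad_html):
    print(is_well_formed(sample), sample)
```

Only the first sample should be accepted. The last one shows why plain HTML sometimes needs a small cleanup (writing `<br/>` instead of `<br>`) before an XML parser will take it.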

In case you aren't familiar with it, XML is the eXtensible Markup Language. XML exists precisely to let you create a set of tags that a parser will recognize. The definitive Web site on XML is http://www.w3.org/XML/.

HTML is the HyperText Markup Language. XHTML is a version of HTML that conforms to the XML language requirements. HTML and XHTML are just predefined sets of tags. For more about the specifics of HTML/XHTML look here.

Now for some discussion directly relating to your question. XML is not a set of tags like HTML. XML allows you to define tags as needed by your application. The nicest thing about XML is that it allows you to do exactly what you're asking about.

You can define your own tags, or use HTML tags, and process them with a parser of your choice. A conforming XML parser already has the rules of XML built into it. There is no reason to write your own. I use the Oracle parser available at the Oracle Technology Web site.
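To make that concrete, here is a small sketch of handing made-up tags to an off-the-shelf parser and asking which tags it found. It uses Python's standard xml.etree.ElementTree rather than the Oracle parser mentioned above, and the tag names (page, banner, menu, item) and the function tag_counts are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(markup: str) -> Counter:
    """Parse the markup and count how often each tag name occurs.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(markup)
    # root.iter() walks the root element and every descendant.
    return Counter(elem.tag for elem in root.iter())


# Invented tags; any well-formed markup would work the same way.
snippet = """
<page>
  <banner color="blue">Welcome</banner>
  <menu>
    <item href="/home">Home</item>
    <item href="/about">About</item>
  </menu>
</page>
"""

counts = tag_counts(snippet)
for tag, n in counts.items():
    print(tag, n)

# Attributes and text are available on each element as well.
for item in ET.fromstring(snippet).iter("item"):
    print(item.get("href"), item.text)
```

The parser needs no knowledge of the tag vocabulary in advance. It reports whatever elements it encounters, and the application decides what each tag means.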

There is also a free parser from Microsoft, MSXML.

You can also Google "XML PARSER" and get tons of hits.

I hope that answers your question. If you have any more, post away!

This was first published in November 2005
