Tuesday 28 August 2012

Regex in UAG to achieve 'not' is possible

I have been stuck on a few problems recently in UAG where I wanted to use regex in a negative form, e.g. 'anything except PDF and DOC'.

You have to use what is called a negative lookahead, which looks something like this:

 ^(?!.*(PDF|DOC).*).*

I am not going to try to explain this in detail, except to say that it allows any text that does not include PDF or DOC anywhere in it. I say 'anywhere' intentionally, as I do not trust the URL not to include parameters and # bookmarks. I am no regex guru, and there are plenty of sites better than this one that can step you through how negative lookahead works, so I won't go character by character. You could improve this by being more specific (at a minimum, maybe \.PDF to force .PDF).
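If you want to see the behaviour outside UAG, here is a minimal sketch in Python. It assumes Python's re module treats this pattern the same way UAG's regex engine does, which I have not confirmed. The helper name allowed, the second pattern and the sample URLs are all made up for illustration.

```python
import re

# The pattern from this post: match only when neither PDF nor DOC
# appears anywhere in the text (case-sensitive as written).
NOT_PDF_OR_DOC = re.compile(r"^(?!.*(PDF|DOC).*).*")

# A stricter, hypothetical variant: require the leading dot and ignore case,
# so ".pdf" is rejected but a folder called "DOCS" is not.
NOT_PDF_OR_DOC_EXT = re.compile(r"^(?!.*\.(pdf|doc)).*", re.IGNORECASE)

def allowed(pattern, url):
    """Return True when the negative-lookahead pattern lets the URL through."""
    return pattern.match(url) is not None

# Made-up sample URLs, including a query string and a # bookmark.
samples = [
    "/docs/index.html",
    "/docs/report.PDF",
    "/DOCS/index.html",
    "/files/a.pdf?x=1#top",
]

for url in samples:
    print(url, allowed(NOT_PDF_OR_DOC, url), allowed(NOT_PDF_OR_DOC_EXT, url))
```

The two patterns disagree on the last two samples. The original rejects anything containing DOC, even in a folder name, and lets a lower-case .pdf through. The stricter variant does the opposite on both.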

You can be more imaginative and use multiple elements and negatives. Use an online regex tester such as regexpal to verify what you are writing, as it is much quicker than testing in UAG!
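As a rough idea of what combining elements can look like, the sketch below chains one positive and two negative lookaheads and checks them against a small table of expected results, much as you would in an online tester. The rule itself (only paths under /reports/, nothing ending in .pdf or .doc), the check helper and every sample path are hypothetical, and again this is Python's regex engine, not UAG's.

```python
import re

# Hypothetical rule: the path must start with /reports/ and must not end in
# .pdf or .doc, where "end" means just before a ?, a # or the end of the text.
RULE = re.compile(
    r"^(?=/reports/)"              # positive lookahead: must be under /reports/
    r"(?!.*\.pdf(?:[?#]|$))"       # negative lookahead: no .pdf extension
    r"(?!.*\.doc(?:[?#]|$))"       # negative lookahead: no .doc extension
    r".*",
    re.IGNORECASE,
)

# Made-up inputs mapped to whether the rule is expected to let them through.
cases = {
    "/reports/summary.html": True,
    "/reports/summary.pdf": False,
    "/reports/summary.PDF?download=1": False,
    "/reports/notes.doc#page2": False,
    "/reports/notes.docx": True,   # .docx is not .doc under this strict rule
    "/other/summary.html": False,
}

def check(pattern, cases):
    """Return the inputs whose match result differs from the expectation."""
    return [text for text, expected in cases.items()
            if (pattern.match(text) is not None) != expected]

failures = check(RULE, cases)
print("all cases pass" if not failures else f"unexpected results: {failures}")
```

Keeping a table like this next to the pattern makes it easy to re-run the checks whenever you tweak the regex, before you paste it into UAG.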

Where do you need regex? The places I use it most are in Appwrap/SecureRemote:
 - in the search tags (add mode="regex" to the search tag)
and more importantly
 - in the pages to parse.

In another post I talked about the 10Mb limit in Appwrap, and I needed this kind of negative regex to exclude binary objects.
