JavaScript, RegExp: examples and validation of regular expressions

Before the advent of hypertext languages, or rather, until it became clear that one needs not just to search, but to search under certain conditions, in a specific place, with changed data and in the right quantities, the usual search and replace functions satisfied even the most sophisticated programmer. Masterpieces of the art of searching were created in programming languages, and databases were refined with selection conditions, stored procedures, triggers and other means of selecting from bulky relational data. The appearance of regular expressions did not cause a revolution, but they turned out to be a useful and convenient tool for searching and replacing information. For example, JavaScript regular expressions for email greatly simplify visitor registration and keep the site from sending messages to obviously malformed addresses.

It cannot be said that a regular expression in JavaScript is much better than a well-thought-out sequence of indexOf() calls wrapped in conditionals and loops, but it can be stated unambiguously that it makes script code compact, though hard for the uninitiated to read.

RegExp object = pattern + engine

A regular expression is a pattern plus an engine. The first is the regular expression itself, the JavaScript RegExp object; the second is the executor that applies the pattern to a string. Each programming language has its own regular expression engine. Not all of the differences are significant, but they must be kept in mind, and a regular expression should always be checked carefully before it is used.

The special notation used to write regular expressions is convenient and effective, but it demands attention, accuracy and patience from the developer. The pattern notation takes some getting used to. This is not a tribute to fashion; it follows from how the JavaScript regular expression mechanism is implemented.

Regex pattern

Two options are allowed:

var expOne = /abc*/i;

var expTwo = RegExp("abc*", "i");

The first form is the usual one. The second takes the pattern as a string in quotation marks, so a '\' character in the pattern must itself be escaped according to the general string rules.

'i' is a flag meaning "case does not matter". The flags 'g' (global search) and 'm' (multi-line search) are also available.
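The two notations and the three flags can be compared side by side. The following is only a small sketch; the variable names and sample strings are made up for illustration.

```javascript
// Two equivalent ways to build the same pattern: a run of digits followed by a dot.
var literalForm = /\d+\./i;
// Inside a string the backslash itself must be escaped, hence "\\d" and "\\.".
var constructorForm = new RegExp("\\d+\\.", "i");

// Both objects hold the same pattern text.
var samePattern = literalForm.source === constructorForm.source;

// Without 'g', match() returns only the first occurrence; with 'g', all of them.
var firstOnly = "a1 b2 c3".match(/[a-z]\d/);
var allMatches = "a1 b2 c3".match(/[a-z]\d/g);

// The 'm' flag lets '^' and '$' match at the start and end of every line.
var withoutM = /^second$/.test("first\nsecond");
var withM = /^second$/m.test("first\nsecond");
```

Forgetting the second backslash in the string form is a common source of patterns that silently match something else.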

The '/' character delimits the pattern in the literal form.

Start and end of regex

The character '^' at the start of a pattern anchors the match to the beginning of the string, and '$' at the end anchors it to the end of the string. Do not experiment with them elsewhere in the expression: there they have a different meaning (for example, '^' directly after an opening square bracket negates the character set).

For instance,

var cRegExp = '^AbcZ$'; // the pattern under test
var eRegExp = new RegExp(cRegExp, 'i');
var cRegRes = '';
var sTest = 'abcz';
if (eRegExp.test(sTest)) {
  cRegRes += ' - Yes';
} else {
  cRegRes += ' - No';
}
var dTestLine = document.getElementById('scTestLine');
dTestLine.innerHTML = 'Expression /' + cRegExp + '/ for string "' + sTest + '"' + cRegRes;

The element 'scTestLine' will have the result (the variable cRegExp has the corresponding value):

expression /^AbcZ$/ for string "abcz" - Yes

If you remove the flag 'i', the result will be:

expression /^AbcZ$/ for string "abcz" - No

Regular expression content

A regular expression is a sequence of characters that is the subject of the search. The expression /qwerty/ looks for an occurrence of exactly this sequence:

expression /qwerty/ for string "qwerty" - Yes

expression /qwerty/ for string "123qwerty456" - Yes

The character '^' changes the meaning of the expression:

expression /^qwerty/ for string "123qwerty456" - No

expression /^qwerty/ for string "qwerty456" - Yes

The same applies to the end-of-string anchor '$'. Regular expressions allow ranges: for example, [a-z], [A-Z] and [0-9] stand for all Latin letters in the given case and for the digits. Russian letters can also be used, but pay attention to the encoding of the strings (both the one being searched and the one being sought) and of the page. Russian letters, like special characters, are often better specified by their codes.

When forming a regular expression, you can say how many times a character may occur at a given place: '*' = 0 or more times; '+' = 1 or more times; {1,} is the same as '+'; {n} = exactly n times; {n,} = n or more times; {n,m} = from n to m times. No spaces are allowed inside the curly brackets.
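The quantifiers listed above can be checked with a few anchored patterns. This is a minimal sketch; the patterns and test strings are chosen only for illustration.

```javascript
// Anchored patterns, so the whole string has to fit the quantifier.
var zeroOrMore = /^ab*$/;     // 'a' followed by any number of 'b', including none
var oneOrMore = /^ab+$/;      // 'a' followed by at least one 'b'
var exactlyThree = /^\d{3}$/; // exactly three digits
var twoOrMore = /^\d{2,}$/;   // two or more digits
var twoToFour = /^\d{2,4}$/;  // from two to four digits

var quantifierResults = [
  zeroOrMore.test("a"),      // true: 'b' may be absent
  oneOrMore.test("a"),       // false: at least one 'b' is required
  oneOrMore.test("abbb"),    // true
  exactlyThree.test("1234"), // false: four digits, not three
  twoOrMore.test("12345"),   // true
  twoToFour.test("12345")    // false: five digits exceed the upper bound
];

// A space inside the braces stops them from acting as a quantifier.
var spacedBraces = /^\d{2, 4}$/.test("123"); // false
```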

Square brackets specify a set of alternatives for a single character: [abcd] = [a-d] = any one of the four characters 'a', 'b', 'c' or 'd'. The opposite can also be specified, any character other than those in the set: [^abcd] = any character except 'a', 'b', 'c' or 'd'. '?' indicates that the preceding character may be absent at that place. '.' matches any character except a line terminator, that is, '\n', '\r', '\u2028' or '\u2029'. The expression '[\s\S]*' matches any sequence of characters, including line breaks; it is often written '[\s|\S]*', although inside square brackets '|' is just a literal character and is redundant. Note that '\s*|\S*' is not the same thing: it matches either a run of whitespace or a run of non-whitespace.
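A short sketch of character sets, the optional marker and the difference between '.' and '[\s\S]'. The sample words are arbitrary.

```javascript
var anyOfFour = /^[a-d]$/;    // the same set as [abcd]
var noneOfFour = /^[^abcd]$/; // any single character except a, b, c, d
var optionalU = /^colou?r$/;  // '?' makes the preceding 'u' optional
var dotRun = /^.*$/;          // '.' does not cross a line break
var anything = /^[\s\S]*$/;   // whitespace or non-whitespace: every character

var setResults = [
  anyOfFour.test("c"),        // true
  anyOfFour.test("e"),        // false
  noneOfFour.test("e"),       // true
  noneOfFour.test("a"),       // false
  optionalU.test("color"),    // true
  optionalU.test("colour"),   // true
  dotRun.test("one\ntwo"),    // false: the line break stops '.'
  anything.test("one\ntwo")   // true
];
```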

Simplified regex options

The expression '[\s|\S]*' matches a whitespace character or a non-whitespace character any number of times, that is, everything in the string. Here '\s' means any whitespace character (space, tab, line break) and '\S' means any character that is not whitespace.

Similarly, '\d' matches a decimal digit and '\D' matches any character that is not a digit. The notations '\f', '\r' and '\n' correspond to form feed, carriage return and line feed.

The tab character is '\t' and the vertical tab is '\v'. The notation '\w' matches any word character (Latin letters, digits, underscore) = [A-Za-z0-9_].

The notation '\W' is equivalent to [^A-Za-z0-9_]: any character that is not a Latin letter, a digit or the sign '_'.

'\0' matches the NUL character. '\xHH' and '\uHHHH' match the character with the code HH or HHHH respectively, where H is a hexadecimal digit.
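The shorthand classes are easiest to remember by seeing what they pick out of a string. The strings below are invented for the illustration.

```javascript
// \d+ collects runs of digits.
var digitRuns = "Room 42, floor 7".match(/\d+/g);

// \w+ collects runs of Latin letters, digits and underscores.
var wordRuns = "snake_case and CamelCase".match(/\w+/g);

// \W is everything that \w is not; here each such character becomes '_'.
var underscored = "a-b c".replace(/\W/g, "_");

// \t matches the tab character.
var tabParts = "one\ttwo".split(/\t/);

// 'A' has the code 0x41, so both escapes match it.
var hexMatchesA = /\x41/.test("A");
var unicodeMatchesA = /\u0041/.test("A");
```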

Recommended Regular Expression Wording and Encoding

It is important to test any regular expression carefully on different kinds of strings.

With experience there will be fewer errors, but always keep in mind that your own knowledge of the rules for writing regular expressions may not match reality, especially when a "regex" is carried over from one language to another.

When choosing between the classic (explicit) notation and the simplified one, prefer the first: the classic form always states clearly what is being sought and how. If the regular expression or the searched string contains Russian letters, bring all strings and the page that hosts the JavaScript code running the regular expression to a single encoding.

When processing characters outside the Latin alphabet, consider specifying character codes rather than the characters themselves.
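As an illustration of using codes instead of the characters themselves, the sketch below matches a word made of Russian letters. The range is an assumption based on the Unicode layout: lowercase а to я occupy U+0430 to U+044F, and ё (U+0451) lies outside that range, so it is listed separately.

```javascript
// Russian letters written as codes, so the source file encoding does not matter.
var russianWord = /^[\u0430-\u044f\u0451]+$/i;

// "привет" spelled with escapes as well.
var lowerWord = "\u043f\u0440\u0438\u0432\u0435\u0442";
// "Привет": the same word with a capital first letter.
var capitalWord = "\u041f\u0440\u0438\u0432\u0435\u0442";

var matchesLower = russianWord.test(lowerWord);     // true
var matchesCapital = russianWord.test(capitalWord); // true, thanks to the 'i' flag
var matchesLatin = russianWord.test("hello");       // false
```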

When implementing JavaScript search algorithms, the regular expression should be carefully checked. It is especially important to control character encoding.

Brackets in regular expressions

Square brackets list the variants of a single character that must be present or absent at a given place; round brackets list variants of sequences. That is only the general rule. It has no exceptions, but it has many diverse applications.

var cRegExp = "[a-z]*.(png|jpg|gif)";
var eRegExp = new RegExp(cRegExp, 'i');
var cRegRes = '';
var sTest = 'picture.jpg';
if (eRegExp.test(sTest)) {
  cRegRes += ' - Yes';
} else {
  cRegRes += ' - No';
}

Results:

expression /[a-z]*.(png|jpg|gif)/ for the string "picture.jpg" - Yes

expression /^[a][a-z]*.(png|jpg|gif)/ for the string "picture.jpg" - No

expression /^[a][a-z]*.(png|jpg|gif)/ for the string "apicture.jpg" - Yes

expression /^[a][a-z]*.(png|jpg|gif)/ for the string "apicture.jg" - No

Note in particular that anything followed by an asterisk may occur zero times. This means that a "regex" can behave in the most unexpected way. Note also that an unescaped '.' matches any character, not only a literal dot.
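The "unexpected" behaviour is easy to reproduce. In the sketch below the loose pattern is the file-name pattern from this section; the stricter variant is only an assumed alternative, with anchors, '+' instead of '*' and an escaped dot.

```javascript
var loosePattern = /[a-z]*.(png|jpg|gif)/i;
var strictPattern = /^[a-z]+\.(png|jpg|gif)$/i; // assumed stricter variant

var fileNames = [".jpg", "picture_jpg", "picture.jpg.exe", "picture.jpg"];

// The loose pattern accepts every sample: the name may be empty,
// '.' matches any character, and nothing ties the match to the string ends.
var looseVerdicts = fileNames.map(function (name) {
  return loosePattern.test(name);
});

// The strict pattern accepts only the last sample.
var strictVerdicts = fileNames.map(function (name) {
  return strictPattern.test(name);
});
```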

RegExp Validation - Email Testing

In JavaScript, regular expression objects have two methods, test and exec, and they can also be passed to the String methods search, split, replace and match.

The test method has already been demonstrated: it checks whether a string matches the regular expression. Its result is true or false.
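The String methods mentioned above accept a regular expression directly. A brief sketch with invented sample strings:

```javascript
var colourLine = "red, green,blue ;  black";

// search() returns the index of the first match, or -1.
var firstComma = colourLine.search(/,/);

// split() cuts the string at every separator, swallowing surrounding spaces.
var colours = colourLine.split(/\s*[,;]\s*/);

// replace() with the 'g' flag collapses every run of whitespace to one space.
var squeezed = "a   b    c".replace(/\s+/g, " ");

// match() with the 'g' flag returns all matches as an array.
var amounts = "10 apples, 25 pears".match(/\d+/g);
```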

Consider the following JavaScript regular expression, an email check from the "complicated, but reliable" category:

var eRegExp = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

for the string var sTest = 'Slava.Chip@sci.by' it gives true, that is, the string is a well-formed email address. The check was made with eRegExp.test(sTest).

Practical use: e-mail processing

The exec method returns an array. The call:

var aResult = eRegExp.exec(sTest);

cRegRes = '<br/>' + aResult.length + '<br/>';
for (var i = 0; i < aResult.length; i++) {
  cRegRes += aResult[i] + '<br/>';
}

gives the following result:

9
Slava.Chip@sci.by
Slava.Chip
Slava.Chip
.Chip
undefined
sci.by
undefined
sci.by
sci.

The other methods work in a similar way, and it is worth trying them yourself. Developing and using regular expressions is best learned by practice; simply copying code is not always advisable here.

Popular "regulars"

The JavaScript regular expression for email above is not the only one; there are many simpler variants, for example /^[\w-\.]+@[\w-]+\.[a-z]{2,3}$/i. This variant, however, does not cover every way an email address can be written.
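What a simplified email pattern misses can be seen on a few addresses. The sketch below uses a pattern of the simple kind just mentioned, with the character set written as [\w.-] (hyphen last); the addresses are invented examples.

```javascript
var simpleEmail = /^[\w.-]+@[\w-]+\.[a-z]{2,3}$/i;

var emailSamples = [
  "user@example.com",      // accepted
  "first.last@example.by", // accepted
  "user@example.info",     // rejected: the top-level domain has four letters
  "user@mail.example.com"  // rejected: only one dot is allowed after '@'
];

var emailVerdicts = emailSamples.map(function (address) {
  return simpleEmail.test(address);
});
```

Both rejected addresses are well formed, which is the price of the shorter pattern.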

Of course, look at your colleagues' experience and analyse the approaches they propose before designing your own regular expression in JavaScript. There are pitfalls, though. Keep in mind that when JavaScript regular expressions are copied from examples, significant characters such as '\', '/' or quotation marks may get duplicated. The result is an error that can take a long time to find.

The familiar "human factor" also matters. A visitor can write a phone number in many ways: 123-45-67, (29) 1234567, 80291234567 or +375291234567, and all of these may be the same number. Writing several patterns is not always acceptable, while rigidly fixing the format of the number can create needless inconvenience or restrictions. The variant /^\d[\d\(\)\ -]{4,14}\d$/i is suitable for most cases of phone number checking.
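It is worth running such a pattern against the sample spellings before relying on it. The sketch below does that, and then shows one possible alternative: stripping everything except digits before checking. The helper normalizePhone and its limits of 7 to 15 digits are assumptions made for this illustration, not part of the original recipe.

```javascript
var phonePattern = /^\d[\d\(\)\ -]{4,14}\d$/;
var phoneSamples = ["123-45-67", "(29) 1234567", "80291234567", "+375291234567"];

// The pattern demands a digit at both ends, so spellings that
// start with '(' or '+' are not accepted as written.
var acceptedAsIs = phoneSamples.filter(function (sample) {
  return phonePattern.test(sample);
});

// Hypothetical helper: keep only digits, then check the length.
function normalizePhone(raw) {
  var digits = raw.replace(/\D/g, "");
  return digits.length >= 7 && digits.length <= 15 ? digits : null;
}

var normalized = phoneSamples.map(normalizePhone);
```

Normalising first keeps the input field forgiving while the stored value stays uniform.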

Even a JavaScript regular expression that only checks numbers needs clarification. It has to be decided whether the number is an integer or a fraction, in exponential or ordinary notation, positive or negative. The presence of a currency symbol, the number of digits after the decimal point and the grouping of the integer part into triads may also have to be taken into account.

The expression /^\d+$/i accepts digits only, and the expression /^\d+\.\d+$/i uses a period to separate the fractional part of a number.
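The two patterns above, plus one possible extension, can be tried on a few strings. The third pattern (optional sign, optional fraction, optional exponent) is an assumed variant added for illustration only.

```javascript
var integerOnly = /^\d+$/;
var fractionOnly = /^\d+\.\d+$/;
// Assumed extension: sign, fractional part and exponent are all optional.
var signedNumber = /^[+-]?\d+(\.\d+)?(e[+-]?\d+)?$/i;

var numberResults = [
  integerOnly.test("2024"),      // true
  integerOnly.test("20.24"),     // false
  fractionOnly.test("20.24"),    // true
  fractionOnly.test("2024"),     // false: the fractional part is mandatory here
  signedNumber.test("-3.14"),    // true
  signedNumber.test("6.02E+23"), // true
  signedNumber.test("1."),       // false: a dot must be followed by digits
  signedNumber.test("abc")       // false
];
```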

In JavaScript, regular expression checks can be used to fix the format of input data rigidly, which is relevant in particular for questionnaires, passport data, legal addresses and so on.

Checking the date: the complex made simple

Consider JavaScript regular expressions for dates. Like those for numbers or phone numbers, they are a choice between rigidity and flexibility. The date of an event is an essential piece of data that often has to be entered, but fixing the input to a specific format such as 'dd-mm-yyyy' or 'd.m.yy' often leaves customers dissatisfied. In a classic HTML form, the jump from the day field to the month field may not happen after a single digit is typed, and typing a second one can cause trouble. For example, 3 is already in the day field, and the next digit, 2, does not replace it but is appended, giving 32, which of course is inconvenient.

The efficiency and convenience of regular expressions depend heavily on how the dialogue with the visitor is built overall. In one case a single input field for the date is best; in another, separate fields for day, month and year are needed. The latter brings extra "code costs": checking for a leap year, the month number and the number of days in each month.
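One way to split the work is to let the regular expression check only the shape of the input and leave the calendar rules to ordinary code. The function parseDate below is a hypothetical sketch under that assumption; it accepts day, month and a four-digit year separated by '-', '.' or '/'.

```javascript
// Hypothetical helper: returns {day, month, year} or null for invalid input.
function parseDate(text) {
  // The regex checks the shape only: 1-2 digits, 1-2 digits, 4 digits.
  var parts = /^(\d{1,2})[-.\/](\d{1,2})[-.\/](\d{4})$/.exec(text);
  if (!parts) {
    return null;
  }
  var day = Number(parts[1]);
  var month = Number(parts[2]);
  var year = Number(parts[3]);

  // Calendar rules are easier to express in code than in a pattern.
  var isLeap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  var daysInMonth = [31, isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
  if (month < 1 || month > 12) {
    return null;
  }
  if (day < 1 || day > daysInMonth[month - 1]) {
    return null;
  }
  return { day: day, month: month, year: year };
}

var leapDay = parseDate("29.02.2024");   // accepted: 2024 is a leap year
var noLeapDay = parseDate("29.02.2023"); // null: February 2023 has 28 days
var badDay = parseDate("32-01-2024");    // null: no month has 32 days
```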

Search and replace, regex memory

Replacing with regular expressions in JavaScript uses the replace method of the String object: it finds a value and changes it in one step. This is convenient for correcting input errors, editing the contents of form fields and converting data from one presentation format to another.

var cRegExp = /([a-z]+)\s([a-z]+)\s([a-z]+)/i; // the search creates three 'variables'

var sTest = 'this article is good!';
var cRegRes = sTest.replace(cRegExp, "$2, $3, $1");

var dTestLine = document.getElementById('scTestLine');

dTestLine.innerHTML = 'Expression ' + cRegExp + ' for the string "' + sTest + '" will get: ' + cRegRes;

Result:

expression /([a-z]+)\s([a-z]+)\s([a-z]+)/i for the string "this article is good!" will get: article, is, this good!

During execution, each pair of parentheses stores its result in a 'variable' $n, where n is the number of the pair ($1, $2, ...). Contrary to the usual convention, numbering here starts from 1, not from 0.
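The same numbered 'variables' are handy for converting data between formats. The sketch below is illustrative only: it reorders a date, shows that the whole match is available as '$&' (there is no '$0'), and passes the groups to a replacement function.

```javascript
// Reorder year-month-day into day.month.year using $1..$3.
var isoDate = "2024-02-29";
var europeanDate = isoDate.replace(/^(\d{4})-(\d{2})-(\d{2})$/, "$3.$2.$1");

// '$&' stands for the whole match.
var bracketed = "price: 42".replace(/\d+/, "[$&]");

// A function receives the whole match first, then the groups in order.
var swapped = "this article is good!".replace(
  /([a-z]+)\s([a-z]+)/i,
  function (whole, first, second) {
    return second.toUpperCase() + " " + first;
  }
);
```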

General recommendations

A regular expression simplifies the code, but the time needed to develop it often matters. Start with simple constructions and then combine them into more complex expressions. Various online services or local tools can be used to check regular expressions.

The best option is to build your own library of regular expressions and your own tool for testing new ones. This is the best way to consolidate experience and learn to create reliable, convenient constructions quickly.

When using repetition of characters and strings, that is, the special characters '*', '+' and curly brackets with repetition counts, be guided by simplicity and expediency. It is important to understand that from the moment a regular expression starts running until it produces a result, it is entirely in the hands of the browser engine. Not all JavaScript engines are equivalent, and each browser may bring its own preferences to the interpretation of regular expressions.

Compatibility does not only apply to pages and stylesheets, it is also relevant to regular expressions. A page using JavaScript can only be considered debugged when it has successfully worked on various browsers.

JavaScript, String and RegExp

Work on the client side, that is, in the visitor's browser in JavaScript, rightly demands high qualifications from the developer. It has long been possible to debug JavaScript code with the browser's own tools or with third-party extensions, code editors and standalone programs.

However, the debugger cannot cope in every case and give the developer good support, quick error detection and identification of bottlenecks. The days when the computer was oriented toward computation are in the distant past. Now special attention is paid to information, and string objects have come to play a significant role. Numbers have become strings, and they reveal their true nature only at the right time and in the right place.

Regular expressions extend the capabilities of strings but demand due respect. Debugging a RegExp while it runs, even if that can be simulated, is not a very rewarding undertaking.

Understanding the structure and logic of the RegExp object, the meaning of the String object, and the syntax and semantics of JavaScript is a sure guarantee of safe and reliable code and of stable operation of each page and of the site as a whole.

Source: https://habr.com/ru/post/K3715/

