PHP, as befits a modern programming language, offers the developer a set of functions for working with regular expressions. With them you can search for occurrences of one string inside another using complex criteria.
Parsing HTML, CSS, XML and other formally structured files is a classic application of the preg_match_all function. It is no less effective for finding addresses, last names, phone numbers, e-mail addresses and other information in free-form text.
Function Format
PHP offers two search functions: preg_match and preg_match_all. The first looks for the first occurrence of the pattern in the string, the second for all occurrences. The term "pattern matching" is also used: the function reports whether the string matches the pattern. Formally, "match" reflects the essence more accurately, but the natural context of the operation is usually "searching" for information. In practice, both terms are in use. The function format is described below.
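As a minimal sketch of the difference between the two functions, the snippet below runs the same pattern through both. The subject string and the pattern are made-up sample values, not taken from any particular project.

```php
<?php
// Sample data for illustration only.
$subject = 'cat, bat, rat';
$pattern = '/[a-z]at/';

// preg_match stops at the first occurrence and returns 1 or 0.
$first = preg_match($pattern, $subject, $one);

// preg_match_all keeps scanning and returns the number of full matches.
$count = preg_match_all($pattern, $subject, $all);

echo $first, "\n";                 // 1
echo $one[0], "\n";                // cat
echo $count, "\n";                 // 3
echo implode(' ', $all[0]), "\n";  // cat bat rat
```

In both calls the third argument is filled by the function, so it does not need to be initialized beforehand.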
The result of the function is a number: the count of matches found (preg_match returns 0 or 1, since it stops at the first match). All matches found are written to the matches array. For preg_match_all, you can also specify how that array is ordered:
- PREG_PATTERN_ORDER;
- PREG_SET_ORDER.
Ordering by the first option groups the results by subpattern (capturing group) number; this is the default. With the second option, the results are grouped by match, in the order the matches occur in the string.
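The two orderings are easiest to compare on a pattern with capturing groups. The following is an illustrative sketch; the subject string and the variable names are assumptions chosen for the example.

```php
<?php
// Sample data for illustration only.
$subject = 'x=1; y=2';
$pattern = '/([a-z])=([0-9])/';

// Default ordering: one sub-array per group (0 = full match).
preg_match_all($pattern, $subject, $byPattern, PREG_PATTERN_ORDER);
// $byPattern[0] = ['x=1', 'y=2']
// $byPattern[1] = ['x', 'y']
// $byPattern[2] = ['1', '2']

// Set ordering: one sub-array per match found in the string.
preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);
// $bySet[0] = ['x=1', 'x', '1']
// $bySet[1] = ['y=2', 'y', '2']

print_r($byPattern);
print_r($bySet);
```

PREG_SET_ORDER tends to be more convenient when each match is processed as a unit, for example in a foreach loop over the matches.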
The character as a template element
It is important to remember that a template operates on characters. Programming has largely forgotten the character data type: modern languages rarely go below the level of the "string". In a template, however, you are manipulating individual characters.
Building a template means, first of all, specifying the desired sequence of characters. If this is clearly understood, there will be no errors in the template, or at least far fewer.
- a is a specific element of the template: one character.
- a-z is also a template element that matches one character, but any one from a to z: the whole lowercase Latin alphabet.
- 0-9 is one digit, any digit, while 1-3 is only 1, 2 or 3.
Case matters in a pattern. The first and last characters of the pattern are also significant: they let you specify where the match must begin and where it must end.
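A small sketch of ranges and case sensitivity follows. The sample strings are invented, and the i modifier used in the second call is an addition that the text above does not cover; it is the usual PCRE way to switch off case sensitivity.

```php
<?php
// 'PHP 8' contains only uppercase letters, so a lowercase range finds nothing.
$lower = preg_match_all('/[a-z]/', 'PHP 8', $m1);

// With the i modifier (not discussed in the text) case is ignored.
$anyCase = preg_match_all('/[a-z]/i', 'PHP 8', $m2);

// The range 1-3 matches only the digits 1, 2 and 3.
$lowDigits = preg_match_all('/[1-3]/', '12345', $m3);

echo $lower, "\n";      // 0
echo $anyCase, "\n";    // 3
echo $lowDigits, "\n";  // 3
```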
Function template
PHP preg_match_all uses Perl-compatible regular expression (PCRE) syntax. Square brackets denote any one of the characters listed inside them, and escape sequences stand for whole classes of characters:
- [abc] matches only the characters a, b, c.
- [^ABC] matches any character except A, B, C.
- \w and \W match word and non-word characters.
- \s and \S match whitespace and non-whitespace characters.
- \d and \D match digits and non-digits.
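The classes listed above can be tried out by counting how many characters each one matches. This is only a sketch on made-up strings; the variable names are arbitrary.

```php
<?php
// Sample strings for illustration only.
$room = 'Room 42, floor 7';

$digits  = preg_match_all('/\d/', $room, $d);         // 4, 2, 7
$spaces  = preg_match_all('/\s/', $room, $s);         // the three spaces
$nonWord = preg_match_all('/\W/', $room, $w);         // three spaces and the comma
$abc     = preg_match_all('/[abc]/', 'cabbage', $a);  // c, a, b, b, a
$notAbc  = preg_match_all('/[^ABC]/', 'ABCD', $n);    // only D

echo "$digits $spaces $nonWord $abc $notAbc\n";       // 3 3 4 5 1
```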
Repetition is written with braces, {n,m}, and applies to the preceding character or element.
- n is the minimum number of repetitions ("not less than");
- m is the maximum number of repetitions ("no more than").
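To illustrate the {n,m} quantifier, the sketch below looks for runs of two to four digits and uses {0,1} to make one letter optional. The sample strings are assumptions made for the example.

```php
<?php
// Runs of 2 to 4 digits: a single digit is too short,
// and a run of six digits is split into 4 + 2.
$count = preg_match_all('/[0-9]{2,4}/', '7 42 123456', $m);

echo $count, "\n";               // 3
echo implode(' ', $m[0]), "\n";  // 42 1234 56

// {0,1} makes the preceding character optional.
$us = preg_match('/^colou{0,1}r$/', 'color');
$uk = preg_match('/^colou{0,1}r$/', 'colour');

echo $us, ' ', $uk, "\n";        // 1 1
```

The quantifier is greedy by default, which is why the six-digit run yields 1234 first and 56 second rather than three pairs.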
The syntax offers many ways to build templates, but it is best to start with the basics: simple templates you have written yourself, without complex elements and combinations.
Simply put, you can create simple patterns by listing the actual characters you need, stating how many of each are wanted, and remembering that the symbol "^" corresponds to the beginning and "$" to the end of the string. Analyzing real, debugged regular expressions written by qualified professionals is a good way to gain the solid knowledge needed for complex uses of preg_match_all. The PHP arsenal is not limited to these two functions, but they are the ones used most often.
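The effect of the "^" and "$" anchors can be seen by testing the same digit pattern with and without them. This is a sketch on invented input strings.

```php
<?php
// Anchored: the whole string must consist of digits.
$whole   = preg_match('/^[0-9]+$/', '2024');
$partial = preg_match('/^[0-9]+$/', 'year 2024');

// Unanchored: digits anywhere in the string are enough.
$loose = preg_match('/[0-9]+/', 'year 2024', $m);

echo $whole, "\n";    // 1
echo $partial, "\n";  // 0
echo $loose, "\n";    // 1
echo $m[0], "\n";     // 2024
```

Anchoring both ends is what turns a search pattern into a validation pattern.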
Simple practice
Pattern for an integer:
This is also an integer pattern, but a sign ("+" or "-") may come first, and extra whitespace in front or behind is allowed for:
- /^[\s|\+|\-]{0,1}[0-9]*/
Similarly:
- /^[\s|\+|\-]{0,1}[0-9]*(\.)[0-9]*/ - a number with a dot.
- /[0-9a-z_-]+@[0-9a-z_^\.]+\.[a-z]{2,3}/ - option for recognizing e-mail.
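Patterns like these are worth trying on sample text before they are relied on. The sketch below assumes the e-mail pattern reads /[0-9a-z_-]+@[0-9a-z_^\.]+\.[a-z]{2,3}/, and adds a hypothetical stricter integer pattern ($integer) that is not from the text: it requires at least one digit and anchors both ends. The addresses use the reserved example.com and example.org domains.

```php
<?php
// Assumed reading of the e-mail pattern from the list above.
$email = '/[0-9a-z_-]+@[0-9a-z_^\.]+\.[a-z]{2,3}/';

// Hypothetical stricter integer pattern (an assumption, not the article's):
// optional whitespace, optional sign, one or more digits, optional whitespace.
$integer = '/^\s*[+-]?[0-9]+\s*$/';

$text  = 'Write to ann@example.com or bob_1@mail.example.org today.';
$found = preg_match_all($email, $text, $matches);

echo $found, "\n";                     // 2
echo implode(' ', $matches[0]), "\n";  // ann@example.com bob_1@mail.example.org

$ok1 = preg_match($integer, ' -42 ');  // 1
$ok2 = preg_match($integer, '+7');     // 1
$bad = preg_match($integer, '4x2');    // 0

echo "$ok1 $ok2 $bad\n";               // 1 1 0
```

Using + rather than * for the digits means an empty string or a lone sign is rejected, and the trailing $ rejects input with extra characters after the number.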
Writing your own templates for preg_match_all, studying examples on the Internet, and analyzing the code of website pages and other sources lets you build up your own template library.
There can be many ways to find the same information. In particular, the last two patterns shown could be written differently. In many cases, preference goes to the template that yields the desired match more quickly and more accurately. Using preg_match_all in PHP, like similar functions in other languages, requires practice, attention and validating the templates beforehand.