YuPcre2 1.7.0 D7-XE10.2 » Developer.Team

YuPcre2 1.7.0 D7-XE10.2 | 5 Mb


YuPcre2 is a library of Delphi components and procedures that implement regular expression pattern matching using the same syntax and semantics as Perl, with just a few differences. It provides two matching algorithms, the standard Perl algorithm and an alternative DFA algorithm:

The Perl algorithm is what you are used to from Perl and JavaScript. It is fast and supports the complete pattern syntax. You will likely use it most of the time.
DFA is a special-purpose algorithm. It finds all possible matches and, in particular, the longest one. It never backtracks and handles partial matching better, especially multi-segment matching of very long subject strings.
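
The difference between the two result sets can be sketched in a few lines. YuPcre2's Delphi API is not shown on this page, so the sketch below uses Python's re module as a stand-in for Perl-style matching; the function names are hypothetical. The "all matches" part is emulated by brute force over prefixes, which only illustrates the results a DFA matcher reports, not how it works internally.

```python
import re

def first_match(pattern, subject):
    """Perl-style result: the first alternative that succeeds at the start."""
    m = re.match(pattern, subject)
    return m.group(0) if m else None

def all_matches_at_start(pattern, subject):
    """Emulated DFA-style result: every prefix of the subject that the
    pattern matches in full, longest first (brute force, illustration only)."""
    compiled = re.compile(pattern)
    return [
        subject[:end]
        for end in range(len(subject), -1, -1)
        if compiled.fullmatch(subject, 0, end)
    ]

print(first_match("cat|category", "category"))
print(all_matches_at_start("cat|category", "category"))
```

With the pattern cat|category, the Perl-style search stops at the first alternative that works, while the emulated DFA-style search also reports the longer alternative.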

YuPcre2 has native interfaces for 8-bit, 16-bit, and 32-bit strings. Component wrappers are available for UnicodeString / WideString and AnsiString / Utf8String / RawByteString.

The YuPcre2 RegEx2 classes descend from common ancestors which implement the core functionality:

Match strings and extract full or substring matches.
Search for regular expressions within streams and memory buffers. TDIRegExSearchStream descendants employ a buffered search within streams and files (of virtually unlimited size) and use little memory.
Replace full matches or partial substrings.
List full matches or partial substrings.
Format full matches or partial substrings by adding static or dynamic text.

Users familiar with DIRegEx might be interested in the differences between YuPcre2 and DIRegEx.

Pattern Syntax

The YuPcre2 regular expression pattern syntax is mostly compatible with Perl. It includes the following:

Quoting
Escaped Characters
Character Types
General Category Properties for \p and \P
PCRE2 Special Category Properties for \p and \P
Script Names for \p and \P
Character Classes
Quantifiers
Anchors and Simple Assertions
Match Point Reset
Alternation
Capturing
Atomic Groups
Comment
Option Setting
Newline Convention
What \R Matches
Lookahead and Lookbehind Assertions
Backreferences
Subroutine References (possibly recursive)
Conditional Patterns
Backtracking Control
Callouts

YuPcre2 RegEx2 String Processing

YuPcre2 can Replace, List, or Format regular expression matches or any of their substrings, which is useful for text editors and word processors. Variable portions of the match can be included in the result text. The full match can be referenced by number, substrings also by name. The character that introduces these references is freely configurable. FormatOptions allow features to be turned on or off as required.

Replace returns the original subject string with matches replaced, similar to but more flexible than Delphi's StringReplace() function.
List collects all string matches into a single string. It extracts multiple phone numbers, e-mail addresses, or URLs, with a single call.
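
The three operations can be illustrated with a short sketch. It uses Python's re module rather than YuPcre2 itself, so the reference syntax (\g<name>) and all function names are assumptions of the example, not YuPcre2's API; the e-mail pattern is deliberately simplified.

```python
import re

# Simplified e-mail pattern with two named substrings (illustration only).
EMAIL = re.compile(r"(?P<user>[\w.+-]+)@(?P<host>[\w-]+(?:\.[\w-]+)+)")

def replace_matches(subject):
    """Replace: return the subject with each match rewritten via a named reference."""
    return EMAIL.sub(r"\g<user>@***", subject)

def list_matches(subject, separator=", "):
    """List: collect all full matches into a single string."""
    return separator.join(m.group(0) for m in EMAIL.finditer(subject))

def format_matches(subject):
    """Format: build new text from substrings of each match plus static text."""
    return [m.expand(r"\g<user> at \g<host>") for m in EMAIL.finditer(subject)]

text = "Write to ann@example.com or bob.smith@mail.example.org today."
print(replace_matches(text))
print(list_matches(text))
print(format_matches(text))
```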

YuPcre2 RegEx2 MaskControls

YuPcre2 includes two regular expression mask edits: TDIRegEx2MaskEdit and TDIRegEx2ComboBox. Both controls validate keyboard input against a regular expression. They work similarly to Delphi's TMaskEdit but are more flexible and powerful.

The regular expression mask edits can:

accept / reject specific characters at determined positions;
allow / reject particular characters if they follow defined character(s);
restrict input text to begin / end with exact character(s);
flag incomplete text to show that more input is needed.

Examples: Numbers, number ranges, dates, phone numbers, e-mail addresses, URLs, currency, and more.
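
The three states a mask edit distinguishes (rejected, incomplete, complete) can be sketched as follows. This is a simplified illustration in Python, not how TDIRegEx2MaskEdit is implemented: the real controls validate against a full regular expression, whereas this sketch assumes a hypothetical fixed-position mask with one character class per position.

```python
import re

# Hypothetical mask for a date such as 2024-01-31: one class per position.
DATE_MASK = [r"\d", r"\d", r"\d", r"\d", "-", r"[01]", r"\d", "-", r"[0-3]", r"\d"]

def classify(text, mask=DATE_MASK):
    """Return 'rejected', 'incomplete', or 'complete' for the text typed so far."""
    if len(text) > len(mask):
        return "rejected"
    for char, char_class in zip(text, mask):
        # Each typed character must match the class at its position.
        if not re.fullmatch(char_class, char):
            return "rejected"
    return "complete" if len(text) == len(mask) else "incomplete"

print(classify("2024-0"))
print(classify("2024-01-31"))
print(classify("2024-1x"))
```

An input control built on this idea would refuse a keystroke that leads to "rejected" and flag the field while the state is "incomplete".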

Workbench Application

The YuPcre2 RegEx2 Workbench helps you design and test regular expressions. It lets you set options, measure execution times, and save and load settings for later use.

The YuPcre2 RegEx2 Workbench is available as

Design-Time Component Editor and
Standalone Application.

YuPcre2 1.7.0
Implement PCRE2_ENDANCHORED, coEndAnchored, and moEndAnchored.
Add an explicit limit on the amount of heap used by pcre2_match, set by pcre2_set_heap_limit, TDIPerlRegEx2_8.HeapLimit, TDIDfaRegEx2_16.HeapLimit, or (*LIMIT_HEAP=xxx) at the start of the pattern.
Extend auto-anchoring etc. to ignore groups with a zero quantifier and single-branch conditions with a false condition (e.g. DEFINE) at the start of a branch. For example, (?(DEFINE)…)^A and (…){0}^B are now flagged as anchored.
Implement PCRE2_EXTENDED_MORE and coExtendedMore, and related /xx and (?xx) features.
Implement (?n: for PCRE2_NO_AUTO_CAPTURE and coNoAutoCapture, because Perl now has this.
Implement extra compile options in the compile context:
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES and coAllowSurrogateEscapes;
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL and coBadEscapeIsLiteral;
PCRE2_EXTRA_MATCH_LINE and coMatchLine;
PCRE2_EXTRA_MATCH_WORD and coMatchWord.
Implement newline type PCRE2_NEWLINE_NUL.
Fixed: a lookbehind assertion that had a zero-length branch caused undefined behaviour when processed by pcre2_dfa_match.
The match limit value now also applies to pcre2_dfa_match as there are patterns that can use up a lot of resources without necessarily recursing very deeply.
Implement PCRE2_LITERAL and coLiteral.
Increased the limit for searching for a “must be present” code unit in subjects from 1000 to 2000 for 8-bit searches, since they are much faster.
Arrange for anchored patterns to record and use “first code unit” data, because this can give a fast “no match” without searching for a “required code unit”. Previously only non-anchored patterns did this.
Upgraded the Unicode tables from Unicode 8.0.0 to Unicode 10.0.0.
Update extended grapheme breaking rules to the latest set that are in Unicode Standard Annex #29.
Added experimental foreign pattern conversion facilities (pcre2_pattern_convert and friends).
If a hyphen that follows a character class is the last character in the class, Perl does not give a warning. PCRE2 now also treats this as a literal.
Fixed: PCRE2 was not throwing an error for [\d-X] (and similar escapes), although it is documented to do so.
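
Three of the new options have simple pattern-level equivalents, which the sketch below illustrates. It is written against Python's re module, not YuPcre2, and the helper names are hypothetical. The wrapping used for the first two follows the PCRE2 documentation, which describes PCRE2_EXTRA_MATCH_WORD as enclosing the pattern in \b(?: and )\b and PCRE2_EXTRA_MATCH_LINE as enclosing it in ^(?: and )$; escaping every metacharacter is only an approximation of PCRE2_LITERAL, which treats the whole pattern as a literal string.

```python
import re

def match_word(pattern):
    """Approximate PCRE2_EXTRA_MATCH_WORD: match only at word boundaries."""
    return r"\b(?:" + pattern + r")\b"

def match_line(pattern):
    """Approximate PCRE2_EXTRA_MATCH_LINE: match only complete lines."""
    return r"^(?:" + pattern + r")$"

def literal(text):
    """Approximate PCRE2_LITERAL: no character in text is a metacharacter."""
    return re.escape(text)

print(re.findall(match_word("cat|dog"), "cat catalog dog"))
print(re.findall(match_line("ab|c"), "ab\nabc\nc", re.MULTILINE))
print(re.search(literal("a.c"), "abc a.c").start())
```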

