YuPcre2 1.12.0 for Delphi 10.3 Rio Cracked
YuPcre2 is a library of Delphi components and procedures that implement regular expression pattern matching using the same syntax and semantics as Perl, with just a few differences. There are two matching algorithms, the standard Perl algorithm and an alternative DFA algorithm:
The Perl algorithm is what you are used to from Perl and JavaScript. It is fast and supports the complete pattern syntax. You will likely be using it most of the time.
DFA is a special-purpose algorithm. It finds all possible matches and, in particular, it finds the longest. It never backtracks and supports partial matching better, in particular multi-segment matching of very long subject strings.
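The difference between the two results can be illustrated with a small sketch. It uses Python's `re` module as a stand-in for a backtracking matcher and a brute-force prefix test to imitate "all possible matches"; it is not the YuPcre2 API, and the helper names `first_match` and `all_matches_at_start` are hypothetical.

```python
import re

def first_match(pattern, subject):
    """Backtracking (Perl-style) result: the first alternative that succeeds wins."""
    m = re.match(pattern, subject)
    return m.group(0) if m else None

def all_matches_at_start(pattern, subject):
    """Every match that starts at offset 0, longest first, found by brute force."""
    regex = re.compile(pattern)
    found = [subject[:end] for end in range(len(subject) + 1)
             if regex.fullmatch(subject[:end])]
    return sorted(found, key=len, reverse=True)

perl_style = first_match(r"a|ab|abc", "abcd")          # "a"
dfa_style = all_matches_at_start(r"a|ab|abc", "abcd")  # ["abc", "ab", "a"]
```

A real DFA matcher produces the same set of matches in a single pass without backtracking; the brute-force loop here only shows which results to expect.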
YuPcre2 has native interfaces for 8-bit, 16-bit, and 32-bit strings. Component wrappers are available for UnicodeString / WideString and AnsiString / Utf8String / RawByteString.
The YuPcre2 RegEx2 classes descend from common ancestors which implement the core functionalities:
Match strings and extract full or substring matches.
Search for regular expressions within streams and memory buffers. TDIRegExSearchStream descendants employ a buffered search within streams and files (of virtually unlimited size) and use little memory.
Replace full matches or partial substrings.
List full matches or partial substrings.
Format full matches or partial substrings by adding static or dynamic text.
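The buffered stream search mentioned in the list above can be pictured roughly as follows. This is a Python sketch, not YuPcre2 code: `search_stream`, `chunk_size`, and `max_match_len` are hypothetical names. It assumes that no match is longer than `max_match_len` bytes and that the pattern does not rely on anchors or lookbehinds reaching before the current buffer.

```python
import io
import re

def search_stream(stream, pattern, chunk_size=4096, max_match_len=64):
    """Yield (absolute_offset, matched_bytes) while holding only a small buffer."""
    regex = re.compile(pattern)
    buf = b""
    base = 0  # absolute stream offset of buf[0]
    while True:
        chunk = stream.read(chunk_size)
        eof = not chunk
        buf += chunk
        # A match starting in the last max_match_len bytes could still grow
        # with more data, so defer it until the next read (unless at EOF).
        safe_end = len(buf) if eof else max(0, len(buf) - max_match_len)
        pos = 0
        for m in regex.finditer(buf):
            if m.start() >= safe_end:
                break
            yield base + m.start(), m.group()
            pos = m.end()
        if eof:
            return
        # Drop everything already searched; keep only the deferred tail.
        keep_from = max(pos, safe_end)
        base += keep_from
        buf = buf[keep_from:]

data = io.BytesIO(b"x" * 10 + b"foo123" + b"y" * 5 + b"foo7")
hits = list(search_stream(data, rb"foo\d+", chunk_size=8, max_match_len=16))
```

After each read the buffer never holds more than `chunk_size + max_match_len` bytes, which is why this style of search scales to very large files.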
Users familiar with DIRegEx might be interested in the differences between YuPcre2 and DIRegEx.
Pattern Syntax
The YuPcre2 regular expression pattern syntax is mostly compatible with Perl. It includes the following:
Quoting
Escaped Characters
Character Types
General Category Properties for \p and \P
PCRE2 Special Category Properties for \p and \P
Script Names for \p and \P
Character Classes
Quantifiers
Anchors and Simple Assertions
Match Point Reset
Alternation
Capturing
Atomic Groups
Comment
Option Setting
Newline Convention
What \R Matches
Lookahead and Lookbehind Assertions
Backreferences
Subroutine References (possibly recursive)
Conditional Patterns
Backtracking Control
Callouts
YuPcre2 RegEx2 String Processing
YuPcre2 can Replace, List, or Format regular expression matches or any of their substrings, which is useful for text editors and word processors. Variable portions of the match can be included in the result text. The full match can be referenced by number, substrings also by name. The character that introduces these references is freely configurable. FormatOptions turns features on or off as required.
Replace returns the original subject string with matches replaced, similar to but more flexible than Delphi's StringReplace() function.
List collects all string matches into a single string. It extracts multiple phone numbers, e-mail addresses, or URLs, with a single call.
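As a rough illustration of the Replace, List, and Format operations, the following sketch uses Python's `re` module. It is a stand-in only: the pattern, the sample text, and the `\g<name>` reference syntax belong to Python, not to YuPcre2, whose reference character is configurable.

```python
import re

subject = "Call 555-0100 or 555-0199 today."
phone = re.compile(r"(?P<prefix>\d{3})-(?P<line>\d{4})")

# Replace: the subject with every match rewritten; substrings are referenced by name.
replaced = phone.sub(r"(\g<prefix>) \g<line>", subject)

# List: all full matches collected into a single string.
listed = ", ".join(m.group(0) for m in phone.finditer(subject))

# Format: each match combined with static text.
formatted = [m.expand(r"line \g<line>") for m in phone.finditer(subject)]
```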
YuPcre2 RegEx2 MaskControls
YuPcre2 includes two regular expression mask edits: TDIRegEx2MaskEdit and TDIRegEx2ComboBox. Both controls validate keyboard input against a regular expression. They work similarly to Delphi's TMaskEdit, but are more flexible and powerful.
The regular expression mask edits can:
accept / reject specific characters at determined positions;
allow / reject particular characters if they follow defined character(s);
restrict input text to begin / end with exact character(s);
flag incomplete text to show that more input is needed.
Examples: Numbers, number ranges, dates, phone numbers, e-mail addresses, URLs, currency, and more.
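The idea of accepting characters by position and flagging incomplete input can be sketched in a few lines. This is a simplified Python illustration with one character class per position, not the logic of TDIRegEx2MaskEdit, which validates against a full regular expression; `DATE_MASK` and `classify` are hypothetical names.

```python
import re

# Hypothetical fixed-position mask for dates such as 2019-12-24.
DATE_MASK = [r"\d"] * 4 + ["-"] + [r"\d"] * 2 + ["-"] + [r"\d"] * 2

def classify(text, mask=DATE_MASK):
    """Return 'complete', 'incomplete', or 'invalid' for the text typed so far."""
    if len(text) > len(mask):
        return "invalid"
    for ch, cls in zip(text, mask):
        # Reject as soon as a character does not fit its position.
        if not re.fullmatch(cls, ch):
            return "invalid"
    return "complete" if len(text) == len(mask) else "incomplete"
```

A control built on this idea would refuse a keystroke that makes the text "invalid" and show a visual hint while the text is still "incomplete".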
Workbench Application
The YuPcre2 RegEx2 Workbench helps you design and test regular expressions. It lets you set options, measure execution times, and save and load settings for later use.
The YuPcre2 RegEx2 Workbench is available as
Design-Time Component Editor and
Standalone Application.
YuPcre2 1.12.0 – 24 Dec 2019
Add a check for the maximum number of capturing subpatterns, which is 65535.
Improve the invalid UTF-32 support of the JIT compiler. Now it correctly detects invalid characters in the 0xd800-0xdfff range.
Fix minor typo bug in JIT compile when \X is used in a non-UTF string.
Add support for matching in invalid UTF strings to the pcre2_match interpreter, and integrate with the existing JIT support via the new PCRE2_MATCH_INVALID_UTF compile-time option.
Adjust the limit for “must have” code unit searching, in particular, increase it substantially for non-anchored patterns.
Allow (*ACCEPT) to be quantified, because an ungreedy quantifier with a zero minimum is potentially useful.
Some changes to the way the minimum subject length is handled:
When PCRE2_NO_START_OPTIMIZE is set, no minimum length is computed.
An incorrect minimum length could be calculated for a pattern that contained (*ACCEPT) inside a quantified group whose minimum repetition was zero, for example A(?:(*ACCEPT))?B, which incorrectly computed a minimum of 2. The minimum length scan no longer happens for a pattern that contains (*ACCEPT).
When no minimum length is set by the normal scan, but a first and/or last code unit is recorded, set the minimum to 1 or 2 as appropriate.
When a pattern contains multiple groups with the same number, a back reference cannot know which one to scan for a minimum length. This used to cause the minimum length finder to give up with no result. Now it treats such references as not adding to the minimum length (which it should have done all along).
Furthermore, the above action now happens only if the back reference is to a group that exists more than once in a pattern instead of any back reference in a pattern with duplicate numbers.
A (*MARK) value inside a successful condition was not being returned by the interpretive matcher (it was returned by JIT). This bug has been mended.
The quantifier {1} was always being ignored, but this is incorrect when it is made possessive and applied to an item in parentheses, because a parenthesized item may contain multiple branches or other backtracking points, for example (a|ab){1}+c or (a+){1}+a.
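Why a possessive {1}+ cannot simply be dropped can be seen by comparing a plain group with an atomic one. The sketch below uses Python's `re` module and emulates the atomic behaviour with a lookahead plus a backreference, a common workaround that does not depend on native possessive quantifier support; the helper `atomic` is hypothetical and this is not YuPcre2 code.

```python
import re

def atomic(group_body):
    """Emulate an atomic group (?>...) via a lookahead and a backreference to group 1."""
    return r"(?=(" + group_body + r"))\1"

# Plain groups may backtrack into their alternatives or repetitions.
plain_alt = re.search(r"(a|ab)c", "abc")
plain_rep = re.search(r"(a+)a", "aaa")

# Atomic groups keep their first result, so the overall match fails.
locked_alt = re.search(atomic(r"a|ab") + "c", "abc")
locked_rep = re.search(atomic(r"a+") + "a", "aaa")
```

Here `plain_alt` and `plain_rep` both match the whole subject, while `locked_alt` and `locked_rep` are `None`, which is the behaviour expected from (a|ab){1}+c and (a+){1}+a.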
DFA matching (using pcre2_dfa_match) was not recognising a partial match if the end of the subject was encountered in a lookahead (conditional or otherwise), an atomic group, or a recursion.
Check for integer overflow when computing lookbehind lengths.
Implement non-atomic positive lookaround assertions.
If a lookbehind contained a lookahead that contained another lookbehind within it, the nested lookbehind was not correctly processed. For example, if (?<=(?=(?<=a)))b was matched to “ab” it gave no match instead of matching “b”.
Implemented pcre2_get_match_data_size.
Two alterations to partial matching:
The definition of a partial match is slightly changed: if a pattern contains any lookbehinds, an empty partial match may be given, because this is another situation where adding characters to the current subject can lead to a full match. Example: c*+(?<=[bc]) with subject “ab”.
Similarly, if a pattern could match an empty string, an empty partial match may be given. Example: (?![ab]).* with subject “ab”. This case applies only to PCRE2_PARTIAL_HARD.
An empty string partial hard match can be returned for \z and \Z as it is documented that they shouldn't match.
A branch that started with (*ACCEPT) was not being recognized as one that could match an empty string.
Corrected pcre2_set_character_tables tables data type: was const C_unsigned_char_num_ptr instead of const C_uint8_t_ptr, as generated by pcre2_maketables.
Upgraded to Unicode 12.1.0.
If the length of one branch of a group exceeded 65535 (the maximum value that is remembered as a minimum length), the whole group's length was incorrectly recorded as 65535, leading to incorrect “no match” when start-up optimizations were in force.
The “rightmost consulted character” value was not always correct; in particular, if a pattern ended with a negative lookahead, characters that were inspected in that lookahead were not included.
Add the pcre2_maketables_free function.
The start-up optimization that looks for a unique initial matching code unit in the interpretive engines uses memchr() in 8-bit mode. When the search is caseless, it was doing so inefficiently, which ended up slowing down the match drastically when the subject was very long. The revised code (a) remembers if one case is not found, so it never repeats the search for that case after a bumpalong and (b) when one case has been found, it searches only up to that position for an earlier occurrence of the other case. This fix applies to both interpretive pcre2_match and to pcre2_dfa_match.
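The revised caseless search can be sketched as follows. This is a Python illustration of steps (a) and (b) only, with `bytes.find` standing in for memchr(); the function name `find_caseless_first` and the `missing` set are hypothetical and do not reflect the actual implementation.

```python
def find_caseless_first(subject, lower, upper, start, missing):
    """Return the earliest index >= start of either case, or -1 if neither occurs."""
    limit = len(subject)
    best = -1
    for case in (lower, upper):
        if case in missing:
            continue  # (a) this case was already found to be absent
        idx = subject.find(case, start, limit)
        if idx == -1:
            # Only a search over the whole remainder proves absence.
            if limit == len(subject):
                missing.add(case)
        else:
            best = idx
            limit = idx  # (b) the other case matters only if it occurs earlier
    return best

subject = b"xxxxWxxwxx"
missing = set()
first = find_caseless_first(subject, b"w", b"W", 0, missing)   # 4
second = find_caseless_first(subject, b"w", b"W", 5, missing)  # 7
third = find_caseless_first(subject, b"w", b"W", 8, missing)   # -1
```

Because the start position only moves forward after a bumpalong, a case recorded in `missing` can be skipped for the rest of the match attempt.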
While scanning to find the minimum length of a group, if any branch has minimum length zero, there is no need to scan any subsequent branches (a small compile-time performance improvement).
Add a check in JIT for an underflow that may occur when the value of the subject string pointer is close to 0.
Arrange for classes such as [Aa] which contain just the two cases of the same character, to be treated as a single caseless character. This causes the first and required code unit optimizations to kick in where relevant.
Improve the bitmap of starting bytes for positive classes that include wide characters, but no property types, in UTF-8 mode. Previously, on encountering such a class, the bits for all bytes greater than $c4 were set, thus specifying any character with codepoint >= $100. Now the only bits that are set are for the relevant bytes that start the wide characters. This can give a noticeable performance improvement.
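To see which bits the improved bitmap needs, it is enough to look at the first UTF-8 byte of each wide character in the class. The following Python sketch computes those lead bytes for a hypothetical class containing U+0100, U+03BB, and U+20AC; `start_bytes` is an illustrative helper, not part of any library.

```python
def start_bytes(wide_chars):
    """Return the sorted UTF-8 lead bytes of the given characters."""
    return sorted({ch.encode("utf-8")[0] for ch in wide_chars})

# U+0100, U+03BB, U+20AC encode as C4 80, CE BB, and E2 82 AC.
lead = start_bytes("\u0100\u03bb\u20ac")
lead_hex = [format(b, "02x") for b in lead]
```

Only three starting bytes are possible for this class, so a subject position beginning with any other byte can be skipped without running the matcher.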
If the bitmap of starting code units contains only 1 or 2 bits, replace it with a single starting code unit (1 bit) or a caseless single starting code unit if the two relevant characters are case-partners. This is particularly relevant to the 8-bit library, though it applies to all. It can give a performance boost for patterns such as [Ww]ord and (word|WORD). However, this optimization doesn't happen if there is a “required” code unit of the same value (because the search for a “required” code unit starts at the match start for non-unique first code unit patterns, but after a unique first code unit, and patterns such as a*a need the former action).
If a non-ASCII character was the first in a starting assertion in a caseless match, the “first code unit” optimization did not get the casing right, and the assertion failed to match a character in the other case if it did not start with the same code unit.
Detect empty matches in JIT.
Fix a JIT bug that allowed the fields of the compiled pattern to be read before its existence was checked.
Capturing groups that contained recursive back references to themselves are no longer atomic.