{"id":86749,"date":"2023-02-04T09:00:27","date_gmt":"2023-02-04T03:30:27","guid":{"rendered":"https:\/\/techvidvan.com\/tutorials\/?p=86749"},"modified":"2023-02-04T09:00:27","modified_gmt":"2023-02-04T03:30:27","slug":"python-regular-expression","status":"publish","type":"post","link":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/","title":{"rendered":"Python Regular Expression"},"content":{"rendered":"<p>Regular Expressions, often known as regex, are a series of characters used to determine<span style=\"font-weight: 400\"> whether or not a pattern is present in a given text (string). Regular expressions have long been used in word processors, text editors, and search and replace functions. They can be used to parse text data files to find, replace, or delete specific strings, validate the format of email addresses or passwords on the server side during registration, and more. In addition, they help with text data manipulation, which is frequently a requirement for data science projects that involve text mining.<\/span><\/p>\n<p><span style=\"font-weight: 400\">This tutorial walks you through the key concepts of regular expressions in Python. You will begin by importing &#8216;re&#8217;, the Python module that supports regular expressions. Following that, you&#8217;ll see how wildcard and special characters are combined with ordinary characters to build patterns. You&#8217;ll then discover how to use repetition in regular expressions. Next, you&#8217;ll learn how to organize your search into groups and named groups for quick access to matches. You&#8217;ll then become acquainted with the idea of greedy vs non-greedy matching.<\/span><\/p>\n<p><span style=\"font-weight: 400\">This already seems like a lot, so we&#8217;ve provided a helpful summary table with brief definitions to make it easier for you to recall what you&#8217;ve already seen. 
Do take a look!<\/span><\/p>\n<p><span style=\"font-weight: 400\">The &#8216;re&#8217; library&#8217;s many helpful functions, including compile(), search(), findall(), sub() for search and replace, split(), and others, are also covered in this tutorial. Additionally, you will discover compilation flags, which you can use to refine your regex.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Use of Regular Expressions in Python<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The &#8216;re&#8217; module in Python supports regular expressions. This means you must import the module with the import re statement before you can use regular expressions in your Python scripts.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The Python re library offers many functions, which makes it well worth knowing. We will look at some of them closely.<\/span><\/p>\n<p><span style=\"font-weight: 400\"><strong>Input<\/strong>:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import re\n\npattern = r\"Cookie\"\nsequence = \"Cookie\"\nif re.match(pattern, sequence):\n    print(\"Matched\")\nelse:\n    print(\"Not a match\")\n<\/pre>\n<p><span style=\"font-weight: 400\"><strong>Output<\/strong>:<\/span><\/p>\n<div class=\"code-output\">Matched\n<p><span style=\"font-weight: 400\">As you can see in the sample, most letters and characters simply match themselves.<\/span><\/p>\n<p><span style=\"font-weight: 400\">If the pattern matches at the start of the text, the match() function returns a match object. Otherwise, it returns None. There are several additional functions in the &#8216;re&#8217; module that you will learn about later.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Let&#8217;s concentrate on ordinary characters for the time being!<\/span><\/p>\n<p><span style=\"font-weight: 400\">Did you notice the r at the start of the pattern Cookie?<\/span><\/p>\n<p><span style=\"font-weight: 400\">This is called a raw string literal. It changes how the string literal is interpreted. 
Such literals are stored exactly as they appear.<\/span><\/p>\n<p><span style=\"font-weight: 400\">For instance, when the string is prefixed with r, \\ is read as a plain backslash rather than the start of an escape sequence. You will see what this means once special characters come in. Some regex syntax relies on backslash-escaped characters, and the raw r prefix prevents Python from interpreting them as string escape sequences.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Basic Python Regular Expression<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The power of regular expressions is that they can specify patterns rather than just fixed characters. The most basic patterns, which match single characters, are listed below:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Ordinary characters like a, X, and 9 simply match themselves. The meta-characters, which do not match themselves because they have special meanings, are: . ^ $ * + ? { } [ ] \\ | ( ) (see details below).<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">. (a period) matches any single character except newline.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">\\w (lowercase w) matches a &#8220;word&#8221; character: a letter, digit, or underbar [a-zA-Z0-9_]. Although &#8220;word&#8221; is the mnemonic for this, it only matches a single word character, not an entire word. \\W (capital W) matches any non-word character.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">\\b matches the boundary between a word and a non-word.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">\\s (lowercase s) matches a single whitespace character, such as space, newline, return, tab, or form feed [ \\n\\r\\t\\f]. \\S (upper case S) matches any non-whitespace character.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">\\t, \\n, and \\r match tab, newline, and return.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">\\d matches a decimal digit [0-9]. (Some older regex utilities do not support \\d, but they all support \\w and \\s.)<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">^ = start, $ = end: match the start or end of the string.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">\\ removes the &#8220;specialness&#8221; of a character. Use \\. to match a period, for instance, or \\\\ to match a backslash. If you are unsure whether a character such as &#8220;@&#8221; has a special meaning, try putting a backslash in front of it: \\@. If the escape sequence is not valid, such as \\c, your Python program will halt with an error.<\/span><\/li>\n<\/ul>\n<h3>Basic Examples of Python Regular Expression<\/h3>\n<p><span style=\"font-weight: 400\">The basic rules a regular expression search follows when looking for a pattern in a string are:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The search proceeds through the string from start to end, stopping at the first match found.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The whole pattern must match, but not necessarily the whole string.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">If match = re.search(pat, text) succeeds, match is not None.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">In particular, match.group() is the text that matched.<\/span><\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import re\n\nmatch = re.search(r'iii', 'piiig')  # found, match.group() == \"iii\"\nmatch = re.search(r'igs', 'piiig')  # not found, match is None\n\n## . = any char but \\n\nmatch = re.search(r'..g', 'piiig')  # found, match.group() == \"iig\"\n\n## \\d = digit char, \\w = word char\nmatch = re.search(r'\\d\\d\\d', 'p123g')  # found, match.group() == \"123\"\nmatch = re.search(r'\\w\\w\\w', '@@abcd!!')  # found, match.group() == \"abc\"\n<\/pre>\n<h3><span style=\"font-weight: 400\">Repetition in Python Regular Expression<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Things become more interesting when you use + and * to specify repetition in the pattern.<\/span><\/p>\n<p><span style=\"font-weight: 400\">+ matches one or more occurrences of the pattern to its left, for example, &#8220;i+&#8221; = one or more i&#8217;s.<\/span><\/p>\n<p><span style=\"font-weight: 400\">* matches zero or more occurrences of the pattern to its left.<\/span><\/p>\n<p><span style=\"font-weight: 400\">? matches 0 or 1 occurrence of the pattern to its left.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Working with Group Extraction<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The &#8220;group&#8221; feature of a regular expression lets you pick out parts of the matching text. Suppose we want to extract the username and host separately from an email address. To accomplish this, surround the username and host in the pattern with parentheses ( ), as in r&#8217;([\\w.-]+)@([\\w.-]+)&#8217;. The parentheses do not change what the pattern will match; instead, they establish logical &#8220;groups&#8221; inside the match text. When a search is successful, match.group(1) is the match text corresponding to the first left parenthesis, and match.group(2) is the text corresponding to the second left parenthesis. 
The plain match.group() is still the whole match text, as usual.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import re\n\ntext = 'purple welcome@techvidvan.com monkey dishwasher'\nmatch = re.search(r'([\\w.-]+)@([\\w.-]+)', text)\nif match:\n  print(match.group())   ## 'welcome@techvidvan.com' (the whole match)\n  print(match.group(1))  ## 'welcome' (the username, group 1)\n  print(match.group(2))  ## 'techvidvan.com' (the host, group 2)\n<\/pre>\n<p><span style=\"font-weight: 400\">A common workflow with regular expressions is to write a pattern for the thing you are looking for and add parenthesis groups to extract the parts you want.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Working with findall<\/span><\/h3>\n<p><span style=\"font-weight: 400\">findall() is probably the single most powerful function in the &#8216;re&#8217; module. In the example above, we used re.search() to locate the first match for a pattern. findall() finds all the matches and returns them as a list of strings, with each string representing one match.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import re\n\n## Suppose we have a text with many email addresses\ntext = 'purple welcome@techvidvan.com, blah monkey Hey@techvidvan.com blah dishwasher'\n\n## Here re.findall() returns a list of all the found email strings\nemails = re.findall(r'[\\w\\.-]+@[\\w\\.-]+', text) ## ['welcome@techvidvan.com', 'Hey@techvidvan.com']\nfor email in emails:\n  print(email)\n<\/pre>\n<p><span style=\"font-weight: 400\">For files, you may be in the habit of writing a loop that iterates over the lines of the file and calls findall() on each line. It is better to let findall() do the iteration for you. 
Pass the entire file text to findall() and it returns a list of all the matches in a single step; recall that f.read() returns the whole text of a file as a single string.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The parenthesis group mechanism can be combined with findall(). If the pattern contains two or more parenthesis groups, findall() returns a list of tuples instead of a list of strings. Each tuple represents one match of the pattern and holds the group(1), group(2), and so on for that match.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import re\n\ntext = 'purple welcome@techvidvan.com, blah monkey Hey@techvidvan.com blah dishwasher'\ntuples = re.findall(r'([\\w\\.-]+)@([\\w\\.-]+)', text)\nprint(tuples)  ## [('welcome', 'techvidvan.com'), ('Hey', 'techvidvan.com')]\nfor username, host in tuples:\n  print(username)  ## username\n  print(host)  ## host\n<\/pre>\n<p><span style=\"font-weight: 400\">Once you have the list of tuples, you can loop over it to do some computation for each tuple. If the pattern has no parentheses, findall() returns a list of found strings, as in the earlier examples. If the pattern contains a single set of parentheses, findall() returns a list of strings corresponding to that single group. (Obscure optional feature: sometimes the pattern contains paren ( ) groupings that you do not want to extract. In that case, write the parens with a ?: at the start, as in (?: ), and that left paren will not count as a group result.)<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Conclusion:<\/span><\/h3>\n<p><span style=\"font-weight: 400\">In this article, we learned about regular expressions and how effectively they match text patterns. It gave a basic overview of regular expressions for everyday Python work and demonstrated how they operate in Python. Support for regular expressions is available from the Python &#8220;re&#8221; module.<\/span><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Regular Expressions, often known as regex, are a series of characters used to determine whether or not a pattern is present in a given text (string). 
Regular expressions have been utilized in word processors,&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":86971,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1053],"tags":[4833],"class_list":["post-86749","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python","tag-python-regular-expression"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Python Regular Expression - TechVidvan<\/title>\n<meta name=\"description\" content=\"Learn about Regular expressions in Python &amp; how they operate. Support for regular expressions is available from the Python &quot;re&quot; module.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Python Regular Expression - TechVidvan\" \/>\n<meta property=\"og:description\" content=\"Learn about Regular expressions in Python &amp; how they operate. 
Support for regular expressions is available from the Python &quot;re&quot; module.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/\" \/>\n<meta property=\"og:site_name\" content=\"TechVidvan\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/TechVidvan\/\" \/>\n<meta property=\"article:published_time\" content=\"2023-02-04T03:30:27+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2023\/01\/getting-started-with-pytho-regular-expression.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:site\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Python Regular Expression - TechVidvan","description":"Learn about Regular expressions in Python & how they operate. Support for regular expressions is available from the Python \"re\" module.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/","og_locale":"en_US","og_type":"article","og_title":"Python Regular Expression - TechVidvan","og_description":"Learn about Regular expressions in Python & how they operate. 
Support for regular expressions is available from the Python \"re\" module.","og_url":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/","og_site_name":"TechVidvan","article_publisher":"https:\/\/www.facebook.com\/TechVidvan\/","article_published_time":"2023-02-04T03:30:27+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2023\/01\/getting-started-with-pytho-regular-expression.webp","type":"image\/webp"}],"author":"TechVidvan Team","twitter_card":"summary_large_image","twitter_creator":"@vidvantech","twitter_site":"@vidvantech","twitter_misc":{"Written by":"TechVidvan Team","Est. reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/#article","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22"},"headline":"Python Regular Expression","datePublished":"2023-02-04T03:30:27+00:00","mainEntityOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/"},"wordCount":1216,"commentCount":0,"publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2023\/01\/getting-started-with-pytho-regular-expression.webp","keywords":["Python Regular Expression"],"articleSection":["Python 
Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/","url":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/","name":"Python Regular Expression - TechVidvan","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/#website"},"primaryImageOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/#primaryimage"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2023\/01\/getting-started-with-pytho-regular-expression.webp","datePublished":"2023-02-04T03:30:27+00:00","description":"Learn about Regular expressions in Python & how they operate. Support for regular expressions is available from the Python \"re\" module.","breadcrumb":{"@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/#primaryimage","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2023\/01\/getting-started-with-pytho-regular-expression.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2023\/01\/getting-started-with-pytho-regular-expression.webp","width":1200,"height":628,"caption":"python regular expression"},{"@type":"BreadcrumbList","@id":"https:\/\/techvidvan.com\/tutorials\/python-regular-expression\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/techvidvan.com\/tutorials\/"},{"@type":"ListItem","position":2,"name":"Python Regular 
Expression"}]},{"@type":"WebSite","@id":"https:\/\/techvidvan.com\/tutorials\/#website","url":"https:\/\/techvidvan.com\/tutorials\/","name":"TechVidvan Blogs","description":"","publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/techvidvan.com\/tutorials\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/techvidvan.com\/tutorials\/#organization","name":"TechVidvan","url":"https:\/\/techvidvan.com\/tutorials\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","width":200,"height":50,"caption":"TechVidvan"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/TechVidvan\/","https:\/\/x.com\/vidvantech"]},{"@type":"Person","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22","name":"TechVidvan Team","description":"The TechVidvan Team delivers practical, beginner-friendly tutorials on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our experts are here to help you upskill and excel in today\u2019s tech industry."}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/86749","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/comments?post=86749"}],"version-history":[{"count":0,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/86749\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media\/86971"}],"wp:attachment":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media?parent=86749"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/categories?post=86749"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/tags?post=86749"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}