Rule - Page HTML / Text
Keyword Presence Content
Regex:
<meta name="keywords" content="([^"]+)"
Find canonical values (Done) - https://optimere.atlassian.net/browse/MON-4625
Regex:
<link rel="canonical" href="([^"]+)"
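A minimal sketch of how the two patterns above return their capture groups, using Python's `re` module. The sample HTML and variable names are made up for illustration; the patterns are copied unchanged.

```python
import re

# Hypothetical page head, used only to exercise the two patterns.
html = (
    '<head>\n'
    '<meta name="keywords" content="news, sports, weather">\n'
    '<link rel="canonical" href="https://example.com/page">\n'
    '</head>'
)

KEYWORDS_RE = re.compile(r'<meta name="keywords" content="([^"]+)"')
CANONICAL_RE = re.compile(r'<link rel="canonical" href="([^"]+)"')

keywords = KEYWORDS_RE.search(html)
canonical = CANONICAL_RE.search(html)

# Group 1 holds the attribute value in both cases.
keyword_value = keywords.group(1) if keywords else None
canonical_value = canonical.group(1) if canonical else None
print(keyword_value)
print(canonical_value)
```

Both patterns expect this exact attribute order and double quotes, so markup such as `<meta content="..." name="keywords">` would not be found.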
Find Content of Meta Tags
Regex:
(meta name=).*(print|bu|shared).*(content=")[A-Z]+. - Regex not matching
Working one:
<meta\s+[^>]*?content=["']([^"']*)["'][^>]*?>
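A small sketch of the working pattern run against two made-up meta tags, to show that it accepts either quote style and does not depend on attribute order. The sample tags are assumptions, not real page data.

```python
import re

# Working pattern from the notes, unchanged.
META_CONTENT_RE = re.compile(r'''<meta\s+[^>]*?content=["']([^"']*)["'][^>]*?>''')

# Hypothetical tags: double quotes, then single quotes with content first.
samples = [
    '<meta name="print" content="YES">',
    "<meta content='shared value' name='shared'>",
]

# Group 1 is the content value.
values = [META_CONTENT_RE.search(s).group(1) for s in samples]
print(values)
```

Note that this pattern matches any meta tag with a content attribute; it does not filter on the name (print, bu, shared) the way the first attempt tried to.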
Check for headings not written in block capitals
Regex:
^[A-Z][a-z]+
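A quick sketch of what this pattern flags, assuming it is applied to the heading text on its own. The sample headings are invented.

```python
import re

HEADING_RE = re.compile(r'^[A-Z][a-z]+')

# Hypothetical heading texts.
headings = ["Contact us", "CONTACT US", "contact us"]

# A heading is flagged when it starts with one capital followed by lowercase letters.
flagged = [h for h in headings if HEADING_RE.search(h)]
print(flagged)
```

Only the first word is inspected, so a heading like "Contact US" would also be flagged.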
H1 presence
Regex:
<h1[^>]*>(.+)((\s)+(.+))+<\/h1> - changed to: <h1[^>]*>([\s\S]*?)<\/h1> (multiple lines)
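A sketch comparing the old and the changed pattern on two made-up headings. It suggests why the change was needed: the old pattern needs whitespace inside the heading text and cannot start on a newline, while `[\s\S]*?` accepts any content, including line breaks.

```python
import re

OLD_H1_RE = re.compile(r'<h1[^>]*>(.+)((\s)+(.+))+<\/h1>')
NEW_H1_RE = re.compile(r'<h1[^>]*>([\s\S]*?)<\/h1>')

# Hypothetical headings: one on a single line, one spread over several lines.
single_line = '<h1 class="title">Welcome</h1>'
multi_line = '<h1>\n  Welcome\n  back\n</h1>'

for sample in (single_line, multi_line):
    old_found = OLD_H1_RE.search(sample) is not None
    new_match = NEW_H1_RE.search(sample)
    # Group 1 of the new pattern is the raw heading content, whitespace included.
    print(old_found, repr(new_match.group(1)))
```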
Find “word” when it is not enclosed in parentheses
Regex:
(?<!\()word(?!\))
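A short sketch of the lookbehind/lookahead pair in action. The sample sentence is invented; "word" stands in for whatever term the policy searches for.

```python
import re

WORD_RE = re.compile(r'(?<!\()word(?!\))')

# Hypothetical text with the term both inside and outside parentheses.
text = "a word, a (word) and another word"

# Start offsets of the occurrences that have no "(" directly before and no ")" directly after.
positions = [m.start() for m in WORD_RE.finditer(text)]
print(positions)
```

An occurrence is skipped as soon as one side has a parenthesis, so "(word" and "word)" are both ignored. There are no word boundaries in the pattern, so it would also match inside a longer term such as "wording".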
Look for phone numbers
Regex:
(([0-9]{2,4}\s.[0-9]{1,4}\s.[0-9]{1,4})|([0-9]{2,}\s.[0-9]{0,1}\s.[0-9]{2,4}\s.[0-9]{2,4})|(132866))
Search for a word inside a tag (e.g. article)
Regex:
<article.*?(?:([WwOoRrDd]).*?)+<\/article>
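A sketch of how this pattern behaves, plus a possible alternative. `[WwOoRrDd]` is a character class, so it matches any single one of those letters rather than the whole word. `WORD_IN_ARTICLE_RE` is an assumption for illustration, not the pattern in use, and the sample markup is invented.

```python
import re

# Pattern from the notes, unchanged.
ORIGINAL_RE = re.compile(r'<article.*?(?:([WwOoRrDd]).*?)+<\/article>')

# Assumed alternative: look for the literal word, case-insensitively,
# anywhere between the opening and closing article tags.
WORD_IN_ARTICLE_RE = re.compile(
    r'<article[^>]*>[\s\S]*?\bword\b[\s\S]*?<\/article>',
    re.IGNORECASE,
)

without_word = '<article>Odd</article>'
with_word = '<article>A Word here</article>'

for sample in (without_word, with_word):
    print(
        ORIGINAL_RE.search(sample) is not None,
        WORD_IN_ARTICLE_RE.search(sample) is not None,
    )
```

The original pattern reports a hit for the first sample even though "word" does not occur in it, because "O" and "d" are in the class.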
Find all emails outside A element
Regex:
[a-z0-9\._%+!$&*=^|~#%'`?{}/\-]+@([a-z0-9\-]+\.){1,}([a-z]{2,16})(?![^<>]*>|[^"]*?<\/a)
Find phone numbers (8 digits) outside A elements
((?:45\s)?(?:\d{2}\s){3}\d{2})(?![^<>]*>|[^"]*?<\/a)
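A sketch of the shared "outside A elements" lookahead used by the email and phone patterns above. The first branch, `[^<>]*>`, rejects matches that sit inside a tag (for example an href value); the second, `[^"]*?<\/a`, rejects matches that are followed by a closing `</a` before any double quote. The sample HTML is made up.

```python
import re

EMAIL_RE = re.compile(
    r'''[a-z0-9\._%+!$&*=^|~#%'`?{}/\-]+@([a-z0-9\-]+\.){1,}([a-z]{2,16})(?![^<>]*>|[^"]*?<\/a)'''
)
PHONE_RE = re.compile(r'((?:45\s)?(?:\d{2}\s){3}\d{2})(?![^<>]*>|[^"]*?<\/a)')

# Hypothetical fragment: one email and one phone number in plain text,
# and one of each inside a link.
html = (
    '<p>Write to info@example.com or call 12 34 56 78 today.</p>'
    '<a href="mailto:sales@example.com">sales@example.com</a>'
    '<a href="tel:+4587654321">87 65 43 21</a>'
)

emails = [m.group(0) for m in EMAIL_RE.finditer(html)]
phones = [m.group(1) for m in PHONE_RE.finditer(html)]
print(emails)
print(phones)
```

The second lookahead branch relies on a double quote appearing before the next `</a`. Plain text followed by a link without quoted attributes (for example `<a>link</a>`) would therefore be skipped as well.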
Incorrect usage of digit separators - for example, a comma instead of a dot in a value
(?:^|[[:blank:]])((?:\d{1,3}\.)*\d{3},\d+)(?:[[:blank:]]|$)
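A sketch of this rule in Python. Python's `re` module does not support POSIX classes such as `[[:blank:]]`, so `[ \t]` (space or tab) is substituted here; that substitution is an assumption about the intended meaning. The sample strings are invented.

```python
import re

# [[:blank:]] replaced by [ \t] because Python's re has no POSIX classes.
SEPARATOR_RE = re.compile(r'(?:^|[ \t])((?:\d{1,3}\.)*\d{3},\d+)(?:[ \t]|$)')

samples = ["Total 1.234,56 kr", "Total 1,234.56 kr", "Price 123,45"]

# findall returns group 1, the flagged value, for every hit.
hits = {s: SEPARATOR_RE.findall(s) for s in samples}
for sample, found in hits.items():
    print(sample, found)
```

The value must be surrounded by blanks or sit at the start or end of the text, so a number directly followed by punctuation would not be flagged.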
Rule - Meta header
Find Meta Keywords with 5 or more keywords
Meta name - Keywords
Regex:
([^,]*,){4,}([^,]*)
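A sketch of where the threshold sits. The pattern requires at least four commas, so it matches a keywords value with five or more comma-separated entries. The sample values are invented.

```python
import re

KEYWORD_COUNT_RE = re.compile(r'([^,]*,){4,}([^,]*)')

# Hypothetical meta keywords values with four and five entries.
four = "news, sports, weather, travel"
five = "news, sports, weather, travel, food"

four_matches = KEYWORD_COUNT_RE.search(four) is not None
five_matches = KEYWORD_COUNT_RE.search(five) is not None
print(four_matches, five_matches)
```

The pattern counts commas, not words, so empty entries or a trailing comma also count towards the threshold. To require more than five entries, the quantifier would have to be `{5,}`.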
Find empty tags (Done) - https://optimere.atlassian.net/browse/MON-4625
(<\w+>)+[ \n(<br>)]*(<\/\w+>)
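A sketch of how this pattern behaves, with a possible stricter variant. Inside `[...]` the text `(<br>)` is not a group: the class matches any single one of the characters space, newline, `(`, `<`, `b`, `r`, `>`, `)`. `STRICT_EMPTY_TAG_RE` is an assumption for illustration, not the implemented rule, and the samples are invented.

```python
import re

# Pattern from the notes, unchanged.
EMPTY_TAG_RE = re.compile(r'(<\w+>)+[ \n(<br>)]*(<\/\w+>)')

# Assumed stricter variant: only whitespace or <br> tags between
# an opening tag and the closing tag of the same name.
STRICT_EMPTY_TAG_RE = re.compile(r'<(\w+)>(?:\s|<br\s*/?>)*</\1>')

samples = ['<p></p>', '<p> <br> </p>', '<p>br</p>', '<p>text</p>']

original_hits = [s for s in samples if EMPTY_TAG_RE.search(s)]
strict_hits = [s for s in samples if STRICT_EMPTY_TAG_RE.search(s)]
print(original_hits)
print(strict_hits)
```

The original pattern also reports `<p>br</p>` as empty, because "b" and "r" are in the character class. Neither pattern handles opening tags that carry attributes.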
NEW
https://monsido.canny.io/policies/p/addition-to-premade-policies
Images wrapped in links
<a href=".?"><(img|png|jpg) src=".?"
or
<a[^>]*><img[^>]*src="([^"]+)"[^>]*><\/a>
Alt attribute that contains image terms (img, png, jpg)
alt=".*?\b(img|png|jpg)\b.*?"
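A sketch running the second linked-image pattern and the alt pattern against one made-up element. Variable names and markup are assumptions.

```python
import re

LINKED_IMAGE_RE = re.compile(r'<a[^>]*><img[^>]*src="([^"]+)"[^>]*><\/a>')
ALT_FILETYPE_RE = re.compile(r'alt=".*?\b(img|png|jpg)\b.*?"')

# Hypothetical linked image whose alt text contains a file-type word.
html = '<a href="/home"><img src="/media/logo.png" alt="logo png"></a>'

# Group 1 of the first pattern is the image source,
# group 1 of the second is the offending term in the alt text.
linked_sources = LINKED_IMAGE_RE.findall(html)
alt_terms = ALT_FILETYPE_RE.findall(html)
print(linked_sources)
print(alt_terms)
```

The linked-image pattern only matches when the `<img>` is the sole content of the link, with no whitespace or text around it.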
https://monsido.canny.io/policies/p/add-a-policy-type-where-we-can-search-after-css-property
Search for a CSS property, e.g. uppercase text set via CSS, or a color set in CSS
To search for uppercase text done by CSS:
text-transform\s*:\s*uppercase;
To search for a color set in CSS:
(color|background(-color)?)\s*:\s*#[A-Fa-f0-9]+;
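A sketch of both CSS patterns against a made-up stylesheet fragment. The selectors and values are invented.

```python
import re

UPPERCASE_RE = re.compile(r'text-transform\s*:\s*uppercase;')
COLOR_RE = re.compile(r'(color|background(-color)?)\s*:\s*#[A-Fa-f0-9]+;')

# Hypothetical CSS; the last declaration has no trailing semicolon.
css = (
    "h1 { text-transform: uppercase; color: #FF0000; }\n"
    ".box { background-color : #fff; border-color: #000 }"
)

has_uppercase = UPPERCASE_RE.search(css) is not None
color_declarations = [m.group(0) for m in COLOR_RE.finditer(css)]
print(has_uppercase)
print(color_declarations)
```

Both patterns require the closing semicolon, so the final declaration in a block is missed when the semicolon is omitted. Only hex colors are covered; named colors and `rgb()` values are not.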
Missing title rule
<title>\s*<\/title>
Important: https://optimere.atlassian.net/browse/MS-3193 link policies:
- Find all Dropbox links
- Search for unsafe links
- Links that contain Lorem ipsum
We could convert them to page policies:
- Find all pages with Dropbox links
- Search for pages with unsafe links
- Find pages with links that contain Lorem ipsum
Policy that can find a certain number of domains/URLs from third parties
- Nespresso
https://github.com/Monsido/frontend/blob/master/src/client/app/modules/global-policy/constants/policy-exchange-center-db.constant.js#L1
https://github.com/Monsido/frontend/tree/master/src/client/app/forms/global-policy/steps/pre-content/rules
Should contain | Fields | Context
---|---|---
Text field: email | Search on the email address that should not match, so that all other emails are found |
Text field: number of emails included | Search number | Number of times the search should be made
Find domains from 3rd party
1st - Regex to find domains
2nd - Conform with → All domains
- Not conform with → Regex domains the user wants to have
With → regex on script - JS (script) Source (src) Links
Not with → internal (should contain the internal domains that are excluded from the search)
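The steps above could be sketched roughly as follows: collect the domains from script `src` attributes, then drop the ones listed as internal. This is only an illustration under stated assumptions. The function name `third_party_script_domains`, the `SCRIPT_SRC_RE` pattern, and the sample markup are hypothetical and not part of the existing policy code.

```python
import re
from urllib.parse import urlparse

# Assumed pattern: the src value of any <script> tag, either quote style.
SCRIPT_SRC_RE = re.compile(r'''<script\b[^>]*?\bsrc=["']([^"']+)["']''', re.IGNORECASE)


def third_party_script_domains(html, internal_domains):
    """Return the set of script hosts that are not in internal_domains."""
    domains = set()
    for src in SCRIPT_SRC_RE.findall(html):
        host = urlparse(src).hostname  # None for relative URLs such as /js/app.js
        if host and host not in internal_domains:
            domains.add(host)
    return domains


# Hypothetical page fragment and internal list.
html = (
    '<script src="https://cdn.example.net/lib.js"></script>'
    '<script src="/js/app.js"></script>'
    '<script async src="https://www.example.com/site.js"></script>'
    '<script src="https://tracker.example.org/t.js"></script>'
)
internal = {"www.example.com"}

external = third_party_script_domains(html, internal)
print(sorted(external))
```

A policy that allows only "a certain number" of third-party domains could then compare `len(external)` with a configured limit.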