Noscrape - Python
Noscrape is a Python library that protects your web content from scraping. It generates a TrueType font whose glyphs are assigned to shuffled Unicode code points and rewrites your text to match, so the text in the page source is meaningless to a scraper while the browser still displays it correctly. This helps developers protect their websites against data harvesting and unauthorized scraping.
Installation
To begin, install the noscrape library using pip:
pip install noscrape
New Instance
Create a new instance of the Noscrape class by providing the path to your desired font file. The instance uses the specified font for obfuscating text:
from noscrape import Noscrape
noscrape = Noscrape("/path/to/font.ttf")
Obfuscation
Obfuscate your text by calling the obfuscate method on your noscrape instance. It converts your text into a series of unique characters from the Private Use Area (PUA) of Unicode, so the result is unreadable unless it is rendered with the matching obfuscated font:
text = noscrape.obfuscate("text to obfuscate")
print(text)
# prints a string of Private Use Area characters, which have no standard glyphs
# an integer can be obfuscated as well
obfuscated_int = noscrape.obfuscate(1337)
# or a list
import json
input_list = ["text1", 1234, "this is the output", 5678]
obfuscated_list = noscrape.obfuscate(input_list)
print(json.dumps(obfuscated_list, ensure_ascii=False))
# prints a JSON array of four obfuscated strings
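To make the idea behind the obfuscation step concrete, here is a minimal conceptual sketch of mapping characters to shuffled PUA code points. It is not noscrape's actual implementation, and the names build_mapping and apply_mapping are hypothetical helpers introduced only for illustration:

```python
import random

PUA_START = 0xE000  # first code point of the BMP Private Use Area
PUA_END = 0xF8FF    # last code point of the BMP Private Use Area


def build_mapping(text, seed=None):
    """Assign each distinct character in `text` a random, unique PUA code point."""
    rng = random.Random(seed)
    chars = sorted(set(text))
    code_points = rng.sample(range(PUA_START, PUA_END + 1), len(chars))
    return {ch: chr(cp) for ch, cp in zip(chars, code_points)}


def apply_mapping(text, mapping):
    """Replace every character with its PUA stand-in."""
    return "".join(mapping[ch] for ch in text)


original = "text to obfuscate"
mapping = build_mapping(original, seed=42)
scrambled = apply_mapping(original, mapping)

# A font that draws each PUA code point with the glyph of the original
# character plays the role of this reverse lookup in the browser.
reverse = {pua: ch for ch, pua in mapping.items()}
restored = "".join(reverse[ch] for ch in scrambled)
```

Without the reverse table (which, in the real workflow, exists only as glyph assignments inside the generated font), the scrambled string carries no readable text.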
Rendering
Render the obfuscated font into a Base64-encoded string using the render method. This encoded string can then be embedded directly into your HTML for easy use:
b64_font = noscrape.render()
print(b64_font)
# T1RUTwAJAIAAAwAQQ0ZGIIaTIyUAAAVkAAARe09TLzL4TxrlAAABAAAAAGBjbWFwACT...
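As a quick sanity check, the start of a rendered string can be decoded to confirm it really contains font data. The sketch below uses only the Python standard library and the first 16 characters of the sample output above; in practice one would pass the full string returned by render:

```python
import base64

# First 16 Base64 characters of the sample output shown above.
prefix = "T1RUTwAJAIAAAwAQ"

# 16 Base64 characters decode to exactly 12 bytes: the font file header.
header = base64.b64decode(prefix)

signature = header[:4]                         # format tag of the font file
num_tables = int.from_bytes(header[4:6], "big")  # number of tables in the font

print(signature, num_tables)
```

For this sample the signature is the four bytes OTTO, which marks an OpenType font with CFF outlines, and the header declares 9 tables.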
Putting it together
Combine everything into a complete HTML document. Embed the Base64-encoded font directly in your HTML using the @font-face rule and apply the obfuscated font to your text with CSS. This ensures that the obfuscated text is displayed correctly in the browser:
<html lang="en">
<head>
<title>Noscrape - DEMO</title>
<style>
@font-face {
font-family: 'noscrape-obfuscated';
src: url('data:font/truetype;charset=utf-8;base64,{{b64_font}}');
}
</style>
</head>
<body>
<div style="font-family: 'noscrape-obfuscated'">{{obfuscated_text}}</div>
</body>
</html>
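The {{b64_font}} and {{obfuscated_text}} placeholders above still have to be filled in by your own code or template engine. A minimal sketch using plain string replacement follows; build_page is a hypothetical helper, and the two sample values are stand-ins for what noscrape.render() and noscrape.obfuscate(...) would return:

```python
import base64
import html

HTML_TEMPLATE = """<html lang="en">
<head>
<title>Noscrape - DEMO</title>
<style>
@font-face {
font-family: 'noscrape-obfuscated';
src: url('data:font/truetype;charset=utf-8;base64,{{b64_font}}');
}
</style>
</head>
<body>
<div style="font-family: 'noscrape-obfuscated'">{{obfuscated_text}}</div>
</body>
</html>"""


def build_page(b64_font, obfuscated_text):
    """Fill both placeholders of the template and return the final HTML."""
    page = HTML_TEMPLATE.replace("{{b64_font}}", b64_font)
    # Escaping leaves PUA characters untouched but guards against stray markup.
    return page.replace("{{obfuscated_text}}", html.escape(obfuscated_text))


# Stand-in values (assumptions): real ones come from the noscrape instance.
sample_font = base64.b64encode(b"OTTO").decode("ascii")
sample_text = "\ue001\ue002\ue003"

page = build_page(sample_font, sample_text)
```

If the project already uses a template engine with this placeholder syntax, passing the two values as template variables achieves the same result.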