This is another security topic.
By watermark, we usually mean the platform user name stamped in a corner of an image, like the one in the picture below. Typically you just upload an image and the platform embeds the watermark for you. Some platforms also provide a switch that controls whether the watermark is shown, in which case the watermark is added when the setting is saved.
Visible watermarks
This kind of watermark is relatively simple to implement: either merge two images into one, or draw the content directly onto the original image:
<img id="pic" src="https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/f3c3c98ebfce4ae28db981dfabedc1d8~tplv-k3u1fbpfcp-zoom-1.image" alt=" Original picture " height="500" crossorigin="anonymous">
<div>Photo by Claudio Schwarz | @purzlbaum on Unsplash</div>
window.onload = () => {
const pic = document.querySelector('#pic');
const canvasNode = document.createElement('canvas');
const picWithWatermark = createImageWithWatermark(pic, canvasNode);
pic.src = picWithWatermark;
}
/**
* Create an image with a watermark.
* @param {HTMLImageElement} img Image element.
* @param {HTMLCanvasElement} canvas Canvas element.
* @returns {string} The watermarked image as a base64 data URL.
*/
const createImageWithWatermark = (img, canvas) => {
const imgWidth = img.width;
const imgHeight = img.height;
canvas.width = imgWidth;
canvas.height = imgHeight;
const ctx = canvas.getContext('2d');
ctx.drawImage(img, 0, 0, imgWidth, imgHeight);
ctx.font = '16px YaHei';
ctx.fillStyle = 'black';
ctx.fillText('Photo by Claudio Schwarz | @purzlbaum on Unsplash', 20, 20);
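// 'image/jpg' is not a recognized MIME type (the browser would fall back to PNG); use 'image/jpeg'
return canvas.toDataURL('image/jpeg');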
}
That is the complete code. A more detailed version can be found via the GitHub link.
This is what ordinary users call a watermark, but for developers the term covers more categories.
For example, on some (possibly all) of the systems on our company intranet, you can see a watermark like this across the screen.
The watermark here is black only so that the effect is easy to see. In real use you would choose a semi-transparent white.
This kind of watermark is similar to the approach above of merging two images into one. The difference is that on the front-end page we cover the whole page with a transparent canvas and draw an identifier on it that identifies the user viewing the current page. That way, whether the leak is a screenshot or a photo, as long as the watermark is visible in the image, we can use it to track down who leaked the information.
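A full-page overlay of this kind can be sketched roughly as follows. This is a minimal illustration rather than the code of any particular system: the function names, the tile size, the text offsets, and the colour are all assumptions, and the id `page-watermark` is simply borrowed from the observer code later in this article. The tile layout is kept separate from the DOM wiring so that it can be checked without a browser.

```javascript
// Compute the top-left corner of every watermark tile needed to cover
// an area of width x height. The default tile size is an assumed value.
function buildWatermarkTiles(width, height, tileWidth = 240, tileHeight = 160) {
  const tiles = [];
  for (let y = 0; y < height; y += tileHeight) {
    for (let x = 0; x < width; x += tileWidth) {
      tiles.push({ x, y });
    }
  }
  return tiles;
}

// Draw the identifying text once per tile on a 2D canvas context.
function drawWatermark(ctx, text, tiles) {
  ctx.font = '16px sans-serif';
  ctx.fillStyle = 'rgba(255, 255, 255, 0.3)'; // semi-transparent white
  for (const { x, y } of tiles) {
    ctx.fillText(text, x + 20, y + 40);
  }
}

// Browser-only wiring: a canvas that covers the viewport and lets
// mouse events pass through to the page underneath.
function mountWatermark(text) {
  const canvas = document.createElement('canvas');
  canvas.id = 'page-watermark';
  canvas.width = window.innerWidth;
  canvas.height = window.innerHeight;
  Object.assign(canvas.style, {
    position: 'fixed',
    left: '0',
    top: '0',
    zIndex: '9999',
    pointerEvents: 'none',
  });
  const tiles = buildWatermarkTiles(canvas.width, canvas.height);
  drawWatermark(canvas.getContext('2d'), text, tiles);
  document.body.appendChild(canvas);
  return canvas;
}
```

In a page you would call something like `mountWatermark(currentUserName)` once after load, where `currentUserName` is whatever identifier your system uses for the logged-in user.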
Then someone might ask: if I know the watermark is just a DOM node, can't I open the console, find it, and delete it?
Defending visible watermarks
That really is a good question, but it is not a big problem. If you want to delete it, you certainly can.
I can't control what you do, but I can detect that you are manipulating this DOM node. However you manipulate it, for safety's sake I will redraw the watermark.
But redrawing alone is not enough, because it just turns into a race between us. So I go one step further: as soon as you touch my DOM, I blank the page and reload it. That effectively prevents users from manipulating the DOM node.
To do this we need MutationObserver, a browser API that can listen for changes to a container.
The code is as follows:
// Callback for the container observer
const cb = function (mutationList, observer) {
for (const mutation of mutationList) {
if (mutation.type === 'childList') {
const { removedNodes = [] } = mutation;
// If the watermark node was removed, clear the page and reload
const node = Array.from(removedNodes).find(node => node.id === 'page-watermark');
if (node) {
targetNode.innerHTML = '';
window.location.reload();
}
}
}
}
// Target DOM node
const targetNode = document.querySelector('#watermark-body');
// Create the observer
const observer = new MutationObserver(cb);
observer.observe(targetNode, {
attributes: true,
childList: true
});
MutationObserver is part of the DOM standard (introduced in DOM4) and replaces the older Mutation Events from DOM Level 3 Events. It is supported by all modern browsers and is safe to use.
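One gap worth noting: the callback above only reacts when the watermark node is removed. Hiding it through a style change would go unnoticed, because `attributes: true` on its own only reports attribute changes on the observed container itself. The sketch below is one possible extension, not the original author's code. It assumes the same `page-watermark` id, adds `subtree: true` so that attribute changes on descendants are reported, and uses hypothetical helper names (`isTampering`, `watch`).

```javascript
const WATERMARK_ID = 'page-watermark';

// Decide whether a mutation record means the watermark was tampered with.
// Works on any object shaped like a MutationRecord.
function isTampering(mutation) {
  if (mutation.type === 'childList') {
    // The watermark node was removed from the container.
    return Array.from(mutation.removedNodes || [])
      .some((node) => node.id === WATERMARK_ID);
  }
  if (mutation.type === 'attributes') {
    // Someone changed style/class/hidden on the watermark node itself.
    return mutation.target != null && mutation.target.id === WATERMARK_ID;
  }
  return false;
}

// Browser-only wiring: observe the container and its descendants.
function watch(targetNode, onTamper) {
  const observer = new MutationObserver((mutations) => {
    if (mutations.some(isTampering)) onTamper();
  });
  observer.observe(targetNode, {
    childList: true,
    attributes: true,
    attributeFilter: ['style', 'class', 'hidden'],
    subtree: true,
  });
  return observer;
}
```

Usage would look like `watch(document.querySelector('#watermark-body'), () => window.location.reload())`. Matching by id is the simplest option here; comparing against a saved reference to the watermark node would be sturdier if the id itself might be edited.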
The above is a global watermark, but you can also watermark only part of the content. Global watermarking is cheaper to implement, though, and for an intranet system, sacrificing some user experience is not a serious problem; it is acceptable.
Someone may object again: I can open the DOM, study its structure, and write a crawler to scrape the data, or simply copy the contents out of the DOM. What use is your watermark then?
I can't refute that, but to be clear: crawling internal data without permission is illegal, and you will be held legally responsible. Your crawler also has to run on some computer, so no watermark is needed; we can check the IP directly and track the person down. The watermark we add is just a convenient tracking aid.
Second, front ends do fight crawlers. If you scrape data from the page, I can avoid rendering the text directly and replace some keywords with images, so what you scrape ends up as a pile of useless text.
That is anti-crawling territory, though. Back to the point: so far we have been talking about visible watermarks. On an intranet there is no problem with them, but what about public websites? Adding this kind of watermark there obviously won't do; sacrificing the user experience is unacceptable.
So we started wondering: can we add a watermark that is invisible to the naked eye?
Invisible watermarks
Of course we can, and that is what we will talk about next.
As the names suggest, invisible watermarks are the opposite of visible ones: we cannot see them. They also differ considerably from visible watermarks in both principle and implementation.
Let's look at the principle first.
You may have heard of steganography [1]. Wikipedia describes this mysterious-sounding term as follows: "Steganography is a technology and science of information hiding. Information hiding means that no one other than the intended recipient is allowed to know of the transmission or the content of the information." At its core, it is still a matter of cryptography.
Appending content to the file
We can write information into an image in many ways. The most common is to write the content to be hidden into the image in binary form. Here is a simple example, using the picture below:
This is the picture we used at the beginning; call it the original image. Save it locally (original.png) and run:
tail -c 50 original.png
You can see a string of garbled characters in the output (a hex viewer shows the file's byte stream; here the bytes are being displayed as UTF-8, so garbled output is normal). Now run these commands on the file:
cat original.png > result.png
echo testWrite >> result.png
tail -c 50 result.png
After creating the new image, we append a string to the end of it. The image still displays normally, and when we view its contents we can see the testWrite string we just wrote:
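The same idea can be expressed in JavaScript. The sketch below is only an illustration under stated assumptions: it works on raw bytes (a `Uint8Array`, for example from `fs.readFileSync` in Node), it assumes a PNG file, and the helper names are made up for this example. It relies on the fact that a PNG ends with a fixed 12-byte IEND chunk, so anything after that chunk is data that image decoders normally ignore.

```javascript
// The 12 bytes of a PNG IEND chunk: length 0, type "IEND", CRC AE 42 60 82.
const IEND = Uint8Array.from([
  0x00, 0x00, 0x00, 0x00, 0x49, 0x45, 0x4e, 0x44, 0xae, 0x42, 0x60, 0x82,
]);

// Index just past the last IEND chunk, or -1 if none is found.
function endOfPng(bytes) {
  for (let i = bytes.length - IEND.length; i >= 0; i--) {
    if (IEND.every((b, j) => bytes[i + j] === b)) return i + IEND.length;
  }
  return -1;
}

// Return a copy of the file bytes with `text` appended (like `echo text >> file`).
function appendTrailer(bytes, text) {
  const extra = new TextEncoder().encode(text);
  const out = new Uint8Array(bytes.length + extra.length);
  out.set(bytes, 0);
  out.set(extra, bytes.length);
  return out;
}

// Read back whatever follows the IEND chunk, decoded as UTF-8.
function readTrailer(bytes) {
  const end = endOfPng(bytes);
  return end === -1 ? null : new TextDecoder().decode(bytes.subarray(end));
}
```

For an untouched PNG, `readTrailer` returns an empty string; after `appendTrailer(bytes, 'testWrite\n')` it returns the appended text.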
Note that adding a string at the head of the file doesn't work, because the file header contains the file format and other information. If you insert data at the head, ordinary software can no longer identify the file type correctly.
Of course, you could design your own codec and create a new file type.
This is only one approach, and a crude one. The processed file differs in size from the original (only slightly, on the order of bytes). A smarter approach is to write encrypted information into the image's binary stream in some way, so that only the person who encrypted it can recover the information.
But even complex encryption isn't enough, because it only guarantees that when people use the original file, we can identify its source and how it circulated. If someone takes a screenshot or a photo, we can't recover this data, because compared with the image we processed, that is a brand-new image.
Modifying RGB component values
Here is another example, a tiny change to RGB component values: overlay the picture with an image that cannot be seen. In short, we write the watermark into a single channel of the picture (such as the B channel of RGB). That may still be hard to picture, so here is an example:
Suppose we want to merge the left and right images without the content of the right image being visible in the left one. What we need to do is write the watermark image into the RGB channels of the picture according to certain rules.
Preprocessing: first generate the watermark image on the right.
Encode
1. Use canvas to get the RGBA data of both images
2. Clear the lowest bit of the B (blue) channel of the left image, that is, b & 0xfffffffe (odd values drop by 1)
3. Read the B channel of the right image; wherever the value is greater than 0, set the lowest bit of the left image's B value, that is, b | 0x00000001 (the value goes up by 1)
Decode
1. Get the RGBA data of the image
2. Read the B channel; wherever b & 0x00000001 > 0, the pixel carries watermark information, so set its B value to 255. Set every other value, except the alpha channel (which is not a color channel), to 0
// A change of +1 or -1 is so small that it does not affect how the image looks
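The encode and decode steps above can be sketched as two small functions. This is a minimal illustration, not the article's original code: the function names are made up, and both inputs are assumed to be RGBA byte arrays of equal length, such as the `data` field of the `ImageData` returned by `ctx.getImageData(...)`.

```javascript
// Write the watermark into the lowest bit of the blue channel.
// `cover` and `mark` are RGBA byte arrays of identical length.
function embedBlueLsb(cover, mark) {
  const out = Uint8ClampedArray.from(cover);
  for (let i = 0; i < out.length; i += 4) {
    const blue = i + 2;
    out[blue] &= 0xfe;                   // clear the lowest bit
    if (mark[blue] > 0) out[blue] |= 1;  // set it where the watermark has blue
  }
  return out;
}

// Recover the watermark: blue = 255 where the lowest bit is set, 0 elsewhere.
// Red and green are zeroed; alpha is copied through unchanged.
function extractBlueLsb(stego) {
  const out = new Uint8ClampedArray(stego.length);
  for (let i = 0; i < stego.length; i += 4) {
    out[i] = 0;
    out[i + 1] = 0;
    out[i + 2] = (stego[i + 2] & 1) === 1 ? 255 : 0;
    out[i + 3] = stego[i + 3];
  }
  return out;
}
```

In a browser, the array returned by `extractBlueLsb` can be wrapped in `new ImageData(result, width, height)` and drawn with `ctx.putImageData` to see the blue-on-black watermark.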
The image with a black background and blue characters is the decoded watermark data. Detailed code:
So does this approach preserve our watermark even when the user takes a screenshot? Not really.
This is the result of decoding a screenshot. You can clearly see that the watermark cannot be decoded from an image captured with the QQ screenshot tool. Even merely compressing the image may destroy the watermark, so this is not a reliable watermarking method.
So how can we make sure our watermark at least survives a screenshot?
It can be done. First decide where the watermark needs to go (that is, pin down the requirement). Most of the images people capture come from web pages, so the idea is to overlay a watermark on the page itself, ensuring that we can trace the source of any image a user captures from it.
The general solution is still CSS, except that we place the background image on the topmost layer and set its opacity very low.
The code is simple: tile a background image across the whole screen, then set opacity to a level the naked eye cannot perceive:
window.onload = () => {
const width = document.body.clientWidth;
const height = document.body.clientHeight;
const maskDiv = document.createElement('div');
maskDiv.id = 'mask_watermark';
maskDiv.style.position = 'absolute';
maskDiv.style.backgroundImage = 'url(./1.jpg)';
maskDiv.style.backgroundRepeat = 'repeat';
maskDiv.style.visibility = '';
maskDiv.style.left = '0px';
maskDiv.style.top = '0px';
maskDiv.style.overflow = "hidden";
maskDiv.style.zIndex = "9999";
maskDiv.style.pointerEvents = "none";
maskDiv.style.opacity = 0.005;
maskDiv.style.fontSize = '20px';
maskDiv.style.color = '#000';
maskDiv.style.textAlign = "center";
maskDiv.style.width = `${width}px`;
maskDiv.style.height = `${height}px`;
maskDiv.style.display = "block";
document.body.appendChild(maskDiv);
}
On the left is an image captured from the web page; on the right is the same image after processing in Photoshop [2], where the watermark we set is clearly visible.
There are many ways to generate the watermark image. It can be generated on the front end, or the front end can send the information to the back end, which generates the image for the front end to use as the background.
Photoshop is not the only way to get the result on the right; other tools work too.
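To see why such a faint overlay can still be recovered, it helps to work through the arithmetic for a single 8-bit channel. The sketch below is a simplified model, not a description of how any browser or image editor actually works: it assumes plain "source over" blending with rounding to the nearest integer, and the function names are made up for this example. Real rendering pipelines may round differently, and lossy compression can erase a one-level difference.

```javascript
// Blend one 8-bit channel value of an overlay onto a background.
function blend(background, overlay, opacity) {
  return Math.round(background * (1 - opacity) + overlay * opacity);
}

// Linear contrast stretch: map the [min, max] range of the input onto [0, 255].
function stretch(values) {
  const min = values.reduce((a, b) => Math.min(a, b), Infinity);
  const max = values.reduce((a, b) => Math.max(a, b), -Infinity);
  if (max === min) return values.map(() => 0);
  return values.map((v) => Math.round(((v - min) / (max - min)) * 255));
}

// A white page (255) with a black watermark stroke (0) at opacity 0.005:
const plain = blend(255, 255, 0.005);  // page pixel under a white part of the overlay
const marked = blend(255, 0, 0.005);   // page pixel under a black stroke
const revealed = stretch([plain, marked, plain]);
```

In this model the marked pixel differs from its neighbours by a single level, which the eye cannot see, but stretching the contrast turns that one-level gap into the full black-to-white range. Adjusting levels or curves in an image editor does essentially the same thing.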
That is it for the front end. But some people may feel this is still not enough: I have watermarked my web page, but what if someone saves the original image? For that you can use the RGB approach described earlier.
And if they download the image and take a screenshot of it, does that stop working too? Indeed. There is very little the front end can do, and we can't handle this case. In the field of invisible (blind) image watermarking, however, there are methods that resist attacks (watermark removal) far better, such as frequency-domain and spatial-domain transforms. Those transforms are well covered elsewhere, so I will leave it at that.
A few more words
The concept of a watermark is broad. It is not only the information displayed in a corner of an image that counts as a watermark.
Appending information to the end of the file was a deliberate choice, not a blind one. Every file has an end, just as the file header specifies the file's format. Even if you change the extension, the real format can still be identified by reading the file header.
And as we know, a file extension can be changed at will. If a check relies only on the extension, it can be bypassed, which leads to arbitrary file upload vulnerabilities.
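Identifying the real format from the file header can be sketched like this. It is a minimal example covering only three common image formats, and the function name and the returned MIME strings are choices made for this illustration; a production upload check would need a fuller signature list and further validation.

```javascript
// Leading "magic bytes" of a few common image formats.
const SIGNATURES = [
  { type: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { type: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
];

// Return the detected MIME type, or null if no known signature matches.
// `bytes` is the start of the file as a Uint8Array (or a plain array of bytes).
function sniffImageType(bytes) {
  const match = SIGNATURES.find(({ bytes: sig }) =>
    sig.every((b, i) => bytes[i] === b));
  return match ? match.type : null;
}
```

A file named `photo.png` whose first bytes are `FF D8 FF` would be reported as `image/jpeg`, regardless of its extension.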
If changing the layer blending mode doesn't reveal the watermark, try adjusting the image's RGB curves.