Check the number of requests sent to a webpage

Posted by 主宰稳场 on 2021-02-19 06:38:05

Question


I am writing a multithreaded Java application that hits millions, and sometimes billions, of URLs on different web servers. The idea is to check whether each URL returns a valid 200 OK response, a 404, or some other status code.

How can I make sure my program is not causing high traffic on their servers? I don't want to cause a denial-of-service (DoS) attack. There are roughly 8 million URLs for each server.

To check the traffic, I created a simple web page hosted at http://localhost:8089/. My application hits this page 2 million times, not all at once but one request at a time. I want to know how efficient my code is, in the sense that it does not overload the network.

netstat shows a lot of connections in TIME_WAIT.

How can I measure the traffic to this page? Is there any software I can use to track it, or a Linux command? I am open to suggestions too.


Answer 1:


Remember the time at which you accessed each server, using a synchronized Map.

One approach is to check the previous access time, and if necessary, sleep in the method itself:

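import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Last (or next reserved) access time per server. Every read and write
// happens inside synchronized (lastAccessTimes).
// Note: URL.equals and URL.hashCode may resolve host names; a String key
// avoids that lookup.
private static final Map<URL, Instant> lastAccessTimes = new HashMap<>();

// Minimum gap between two requests to the same server.
private static final Duration COOLDOWN = Duration.ofSeconds(60);

public void contact(URL url)
throws IOException,
       InterruptedException {

    // Reduce the URL to scheme://host[:port] so that all URLs of one
    // server share a single map entry.
    URL server = new URL(url.getProtocol() + "://" + url.getAuthority());

    Instant now = Instant.now();
    long delay = 0;

    synchronized (lastAccessTimes) {
        Instant newAccessTime = now;

        Instant lastAccess = lastAccessTimes.get(server);
        if (lastAccess != null) {
            Instant soonestAllowed = lastAccess.plus(COOLDOWN);
            if (now.isBefore(soonestAllowed)) {
                // Too early: reserve the next free slot and wait for it.
                newAccessTime = soonestAllowed;
                delay = now.until(soonestAllowed, ChronoUnit.NANOS);
            }
        }

        lastAccessTimes.put(server, newAccessTime);
    }

    // Sleep outside the lock so other threads are not blocked meanwhile.
    if (delay > 0) {
        TimeUnit.NANOSECONDS.sleep(delay);
    }

    URLConnection connection = url.openConnection();
    // etc.
}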
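The bookkeeping inside the synchronized block can be looked at on its own. The following is only a sketch under assumptions: `CooldownTracker` and `reserve` are hypothetical names that do not appear in the answer, the server key is a plain String, and the current time is passed in so the logic can be exercised without sleeping or opening connections.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not from the original answer): isolates the
// per-server cooldown bookkeeping from sleeping and networking.
class CooldownTracker {
    private final Map<String, Instant> lastAccessTimes = new HashMap<>();
    private final Duration cooldown;

    CooldownTracker(Duration cooldown) {
        this.cooldown = cooldown;
    }

    /**
     * Reserves the next access slot for the given server key and returns
     * how long the caller should wait before using it (zero means "now").
     */
    synchronized Duration reserve(String server, Instant now) {
        Instant slot = now;
        Instant last = lastAccessTimes.get(server);
        if (last != null) {
            Instant soonestAllowed = last.plus(cooldown);
            if (now.isBefore(soonestAllowed)) {
                // Too early: queue this caller behind the previous slot.
                slot = soonestAllowed;
            }
        }
        lastAccessTimes.put(server, slot);
        return Duration.between(now, slot);
    }
}
```

Because each call stores the reserved slot and not the current time, callers that arrive together for the same server are spaced one cooldown apart, while callers for other servers are not delayed.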

If you don’t want to risk holding up your thread pool, you can use your executor to try again after the cooldown period. For example, using the default thread pool of CompletableFuture:

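import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;

// Last access time per server. Every read and write happens inside
// synchronized (lastAccessTimes).
private static final Map<URL, Instant> lastAccessTimes = new HashMap<>();

private static final Duration COOLDOWN = Duration.ofSeconds(60);

// Assumes a java.util.logging.Logger field named "logger" in this class.
public void contact(URL url) {
    URL server;
    try {
        // Reduce the URL to scheme://host[:port].
        server = new URL(url.getProtocol() + "://" + url.getAuthority());
    } catch (MalformedURLException e) {
        logger.log(Level.WARNING,
            "Could not extract server from " + url, e);
        return;
    }

    Instant now = Instant.now();

    synchronized (lastAccessTimes) {
        Instant lastAccess = lastAccessTimes.get(server);
        if (lastAccess != null) {
            Instant soonestAllowed = lastAccess.plus(COOLDOWN);
            if (now.isBefore(soonestAllowed)) {
                long delay = now.until(soonestAllowed, ChronoUnit.NANOS);

                // Too early: retry after the cooldown without blocking
                // this thread. delayedExecutor requires Java 9 or later.
                CompletableFuture.runAsync(() -> contact(url),
                    CompletableFuture.delayedExecutor(
                        delay, TimeUnit.NANOSECONDS));
                return;
            }
        }

        lastAccessTimes.put(server, now);
    }

    try {
        URLConnection connection = url.openConnection();
        // etc.
    } catch (IOException e) {
        logger.log(Level.WARNING, "Could not contact " + url, e);
    }
}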
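To see how many requests the program actually sends to each server, one option is to count them in the client, next to the cooldown logic. This is a minimal sketch and not part of the original answer: `RequestCounter`, `record`, `count` and `requestsPerSecond` are hypothetical names, and it assumes `record` is called once per request with the same server key (for example scheme://host[:port]) that the cooldown map uses.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper: thread-safe count of requests sent per server.
class RequestCounter {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Call once for every request sent to the given server. */
    void record(String server) {
        counts.computeIfAbsent(server, key -> new LongAdder()).increment();
    }

    /** Total number of requests recorded for the server so far. */
    long count(String server) {
        LongAdder adder = counts.get(server);
        return adder == null ? 0 : adder.sum();
    }

    /** Average request rate for the server over the given elapsed time. */
    double requestsPerSecond(String server, Duration elapsed) {
        long millis = elapsed.toMillis();
        if (millis <= 0) {
            return 0.0;
        }
        return count(server) * 1000.0 / millis;
    }
}
```

Logging these numbers periodically gives a client-side view of the load placed on each server. It does not replace server-side measurements such as access logs.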


Source: https://stackoverflow.com/questions/64729950/check-the-number-of-requests-sent-to-a-webpage
