How many threads should I use in my Java program?

前端 未结 7 1282
日久生厌
日久生厌 2020-12-09 18:03

I recently inherited a small Java program that takes information from a large database, does some processing and produces a detailed image regarding the information. The ori

相关标签:
7条回答
  • 2020-12-09 18:43

    Threads are fine, but as others have noted, you have to be highly aware of your bottlenecks. Your algorithm sounds like it would be susceptible to cache contention between multiple CPUs - this is particularly nasty because it has the potential to hit the performance of all of your threads (normally you think of using multiple threads to continue processing while waiting for slow or high latency IO operations).

    Cache contention is a very important aspect of using multi CPUs to process a highly parallelized algorithm: Make sure that you take your memory utilization into account. If you can construct your data objects so each thread has it's own memory that it is working on, you can greatly reduce cache contention between the CPUs. For example, it may be easier to have a big array of ints and have different threads working on different parts of that array - but in Java, the bounds checks on that array are going to be trying to access the same address in memory, which can cause a given CPU to have to reload data from L2 or L3 cache.

    Splitting the data into it's own data structures, and configure those data structures so they are thread local (might even be more optimal to use ThreadLocal - that actually uses constructs in the OS that provide guarantees that the CPU can use to optimize cache.

    The best piece of advice I can give you is test, test, test. Don't make assumptions about how CPUs will perform - there is a huge amount of magic going on in CPUs these days, often with counterintuitive results. Note also that the JIT runtime optimization will add an additional layer of complexity here (maybe good, maybe not).

    0 讨论(0)
  • 2020-12-09 18:43

    number of processors is a good start; but if those threads do a lot of i/o, then might be better with more... or less.

    first think of what are the resources available and what do you want to optimise (least time to finish, least impact to other tasks, etc). then do the math.

    sometimes it could be better if you dedicate a thread or two to each i/o resource, and the others fight for CPU. the analisys is usually easier on these designs.

    0 讨论(0)
  • 2020-12-09 18:44

    On the one hand, you'd like to think Threads == CPU/Cores makes perfect sense. Why have a thread if there's nothing to run it?

    The detail boils down to "what are the threads doing". A thread that's idle waiting for a network packet or a disk block is CPU time wasted.

    If your threads are CPU heavy, then a 1:1 correlation makes some sense. If you have a single "read the DB" thread that feeds the other threads, and a single "Dump the data" thread and pulls data from the CPU threads and create output, those two could most likely easily share a CPU while the CPU heavy threads keep churning away.

    The real answer, as with all sorts of things, is to measure it. Since the number is configurable (apparently), configure it! Run it with 1:1 threads to CPUs, 2:1, 1.5:1, whatever, and time the results. Fast one wins.

    0 讨论(0)
  • 2020-12-09 18:45

    The number that your application needs; no more, and no less.

    Obviously, if you're writing an application which contains some parallelisable algorithm, then you can probably start benchmarking to find a good balance in the number of threads, but bear in mind that hundreds of threads won't speed up any operation.

    If your algorithm can't be parallelised, then no number of additional threads is going to help.

    0 讨论(0)
  • 2020-12-09 18:47

    Yes, that's a perfectly reasonable approach. One thread per processor/core will maximize processing power and minimize context switching. I'd probably leave that as-is unless I found a problem via benchmarking/profiling.

    One thing to note is that the JVM does not guarantee availableProcessors() will be constant, so technically, you should check it immediately before spawning your threads. I doubt that this value is likely to change at runtime on typical computers, though.

    P.S. As others have pointed out, if your process is not CPU-bound, this approach is unlikely to be optimal. Since you say these threads are being used to generate images, though, I assume you are CPU bound.

    0 讨论(0)
  • 2020-12-09 18:47

    After seeing your edit, it's quite possible that one thread per CPU is as good as it gets. Your application seems quite parallelizable. If you have extra hardware you can use GridGain to grid-enable your app and have it run on multiple machines. That's probably about the only thing, beyond buying faster / more cores, that will speed it up.

    0 讨论(0)
提交回复
热议问题