mirror of
https://github.com/bitcoin/bitcoin.git
synced 2025-01-15 06:12:37 -03:00
bc679829e2
937bf4335
Use std::thread::hardware_concurrency, instead of Boost, to determine available cores (fanquake)
Pull request description:
Following discussion on IRC about replacing Boost usage for detecting available system cores, I've opened this to collect some benchmarks + further discussion.
The current method for detecting available cores was introduced in #6361.
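As a rough illustration of the replacement under discussion, here is a minimal sketch using only the C++11 standard library. `std::thread::hardware_concurrency()` is a standard call that counts logical (virtual) cores and may return 0 when the count cannot be determined. The function name `GetNumCoresSketch` and the fallback to 1 are assumptions made for this example, not necessarily what the merged change does.

```cpp
#include <thread>

// Hypothetical stand-in for GetNumCores(); not the merged implementation.
// std::thread::hardware_concurrency() reports logical cores (hyperthreads
// included) and is only a hint: it may return 0 if the value is unknown.
int GetNumCoresSketch()
{
    const unsigned int n = std::thread::hardware_concurrency();
    // Assumption of this sketch: treat "unknown" as a single core.
    return n == 0 ? 1 : static_cast<int>(n);
}
```

Unlike Boost's `physical_concurrency()`, the standard library offers no way to ask for physical cores only, which is the trade-off debated in the chat below.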
Recap of the IRC chat:
```
21:14:08 fanquake: Since we seem to be giving Boost removal a good shot for 0.15, does anyone have suggestions for replacing GetNumCores?
21:14:26 fanquake: There is std::thread::hardware_concurrency(), but that seems to count virtual cores, which I don't think we want.
21:14:51 BlueMatt: fanquake: I doubt we'll do boost removal for 0.15
21:14:58 BlueMatt: shit like BOOST_FOREACH, sure
21:15:07 BlueMatt: but all of boost? doubtful, there are still things we need
21:16:36 fanquake: Yea sorry, not the whole lot, but we can remove a decent chunk. Just looking into what else needs to be done to replace some of the less involved Boost usage.
21:16:43 BlueMatt: fair
21:17:14 wumpus: yes, it makes sense to plan ahead a bit, without immediately doing it
21:18:12 wumpus: right, don't count virtual cores, that used to be the case but it makes no sense for our usage
21:19:15 wumpus: it'd create a swarm of threads overwhelming any machine with hyperthreading (+accompanying thread stack overhead), for script validation, and there was no gain at all for that
21:20:03 sipa: BlueMatt: don't worry, there is no hurry
21:59:10 morcos: wumpus: i don't think that is correct
21:59:24 morcos: suppose you have 4 cores (8 virtual cores)
21:59:24 wumpus: fanquake: indeed seems that std has no equivalent to physical_concurrency, on any standard. That's annoying as it is non-trivial to implement
21:59:35 morcos: i think running par=8 (if it let you) would be notably faster
21:59:59 morcos: jeremyrubin and i discussed this at length a while back... i think i commented about it on irc at the time
22:00:21 wumpus: morcos: I think the conclusion at the time was that it made no difference, but sure would make sense to benchmark
22:00:39 morcos: perhaps historical testing on the virtual vs actual cores was polluted by concurrency issues that have now improved
22:00:47 wumpus: I think there are not more ALUs, so there is not really a point in having more threads
22:01:40 wumpus: hyperthreads are basically just a stored register state right?
22:02:23 sipa: wumpus: yes but it helps the scheduler
22:02:27 wumpus: in which case the only speedup using "number of cores" threads would give you is, possibly, excluding other software from running on the cores on the same time
22:02:37 morcos: well this is where i get out of my depth
22:02:50 sipa: if one of the threads is waiting on a read from ram, the other can use the arithmetic unit for example
22:02:54 morcos: wumpus: i'm pretty sure though that the speed up is considerably more than what you might expect from that
22:02:59 wumpus: sipa: ok, I back down, I didn't want to argue this at all
22:03:35 morcos: the reason i haven't tested it myself, is the machine i usually use has 16 cores... so not easy due to remaining concurrency issues to get much more speedup
22:03:36 wumpus: I'm fine with restoring it to number of virtual threads if that's faster
22:03:54 morcos: we should have someone with 4 cores (and  actually test it though, i agree
22:03:58 sipa: i would expect (but we should benchmark...) that if 8 script validation threads instead of 4 on a quadcore hyperthreading is not faster, it's due to lock contention
22:04:20 morcos: sipa: yeah thats my point, i think lock contention isn't that bad with 8 now
22:04:22 wumpus: on 64-bit systems the additional thread overhead wouldn't be important at least
22:04:23 gmaxwell: I previously benchmarked, a long time ago, it was faster.
22:04:33 gmaxwell: (to use the HT core count)
22:04:44 wumpus: why was this changed at all then?
22:04:47 wumpus: I'm confused
22:05:04 sipa: good question!
22:05:06 gmaxwell: I had no idea we changed it.
22:05:25 wumpus: sigh 
22:05:54 gmaxwell: What PR changed it?
22:06:51 gmaxwell: In any case, on 32-bit it's probably a good tradeoff... the extra ram overhead is worth avoiding.
22:07:22 wumpus: https://github.com/bitcoin/bitcoin/pull/6361
22:07:28 gmaxwell: PR 6461 btw.
22:07:37 gmaxwell: er lol at least you got it right.
22:07:45 wumpus: the complaint was that systems became unusably slow when using that many threads
22:07:51 wumpus: so at least I got one thing right, woohoo
22:07:55 sipa: seems i even acked it!
22:07:57 BlueMatt: wumpus: there are more alus
22:08:38 BlueMatt: but we need to improve lock contention first
22:08:40 morcos: anyway, i think in the past the lock contention made 8 threads regardless of cores a bit dicey.. now that is much better (although more still to be done)
22:09:01 BlueMatt: or we can just merge #10192, thats fee
22:09:04 gribble: https://github.com/bitcoin/bitcoin/issues/10192 | Cache full script execution results in addition to signatures by TheBlueMatt · Pull Request #10192 · bitcoin/bitcoin · GitHub
22:09:11 BlueMatt: s/fee/free/
22:09:21 morcos: no, we do not need to improve lock contention first. but we should probably do that before we increase the max beyond 16
22:09:26 BlueMatt: then we can toss concurrency issues out the window and get more speedup anyway
22:09:35 gmaxwell: wumpus: yea, well in QT I thought we also diminished the count by 1 or something? but yes, if the motivation was to reduce how heavily the machine was used, thats fair.
22:09:56 sipa: the benefit of using HT cores is certainly not a factor 2
22:09:58 wumpus: gmaxwell: for the default I think this makes a lot of sense, yes
22:10:10 gmaxwell: morcos: right now on my 24/28 physical core hosts going beyond 16 still reduces performance.
22:10:11 wumpus: gmaxwell: do we also restrict the maximum par using this? that'd make less sense
22:10:51 wumpus: if someone *wants* to use the virtual cores they should be able to by setting -par=
22:10:51 sipa: *flies to US*
22:10:52 BlueMatt: sipa: sure, but the shared cache helps us get more out of it than some others, as morcos points out
22:11:30 BlueMatt: (because it means our thread contention issues are less)
22:12:05 morcos: gmaxwell: yeah i've been bogged down in fee estimation as well (and the rest of life) for a while now.. otherwise i would have put more effort into jeremy's checkqueue
22:12:36 BlueMatt: morcos: heh, well now you can do other stuff while the rest of us get bogged down in understanding fee estimation enough to review it 
22:12:37 wumpus: [to answer my own question: no, the limit for par is MAX_SCRIPTCHECK_THREADS, or 16]
22:12:54 morcos: but to me optimizing for more than 16 cores is pretty valuable as miners could use beefy machines and be less concerned by block validation time
22:14:38 BlueMatt: morcos: i think you may be surprised by the number of mining pools that are on VPSes that do not have 16 cores 
22:15:34 gmaxwell: I assume right now most of the time block validation is bogged in the parts that are not as concurrent. simple because caching makes the concurrent parts so fast. (and soon to hopefully increase with bluematt's patch)
22:17:55 gmaxwell: improving sha2 speed, or transaction malloc overhead are probably bigger wins now for connection at the tip than parallelism beyond 16 (though I'd like that too).
22:18:21 BlueMatt: sha2 speed is big
22:18:27 morcos: yeah lots of things to do actually...
22:18:57 gmaxwell: BlueMatt: might be a tiny bit less big if we didn't hash the block header 8 times for every block. 
22:21:27 BlueMatt: ehh, probably, but I'm less rushed there
22:21:43 BlueMatt: my new cache thing is about to add a bunch of hashing
22:21:50 BlueMatt: 1 sha round per tx
22:22:25 BlueMatt: and sigcache is obviously a ton
```
Tree-SHA512: a594430e2a77d8cc741ea8c664a2867b1e1693e5050a4bbc8511e8d66a2bffe241a9965f6dff1e7fbb99f21dd1fdeb95b826365da8bd8f9fab2d0ffd80d5059c
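The chat notes that the default thread count follows the detected core count, while an explicit `-par=` is limited only by `MAX_SCRIPTCHECK_THREADS` (16). A minimal sketch of that clamping follows. The constant's name and value are quoted from the chat; the helper name `ResolveScriptCheckThreads` and the rule that values of 0 or below are taken relative to the core count are assumptions of this example, and the real startup code may differ.

```cpp
#include <algorithm>

// Name and value as quoted in the chat ("MAX_SCRIPTCHECK_THREADS, or 16").
static const int MAX_SCRIPTCHECK_THREADS = 16;

// Hypothetical helper: resolve a requested -par value against the number of
// detected cores. Assumption of this sketch: 0 means "auto" and a negative
// value -k means "leave k cores free".
int ResolveScriptCheckThreads(int requested, int num_cores)
{
    int n = requested;
    if (n <= 0) {
        n += num_cores;
    }
    // Never go below one thread or above the hard maximum.
    return std::max(1, std::min(n, MAX_SCRIPTCHECK_THREADS));
}
```

With this shape, a user on a 4-core, 8-thread machine can still pass 8 explicitly, while the automatic default simply follows whatever the core-detection function reports.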
359 lines
11 KiB
C++
// Copyright (c) 2009-2010 Satoshi Nakamoto
// Copyright (c) 2009-2017 The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.

/**
 * Server/client environment: argument handling, config file parsing,
 * logging, thread wrappers, startup time
 */
#ifndef BITCOIN_UTIL_H
#define BITCOIN_UTIL_H

#if defined(HAVE_CONFIG_H)
#include <config/bitcoin-config.h>
#endif

#include <compat.h>
#include <fs.h>
#include <sync.h>
#include <tinyformat.h>
#include <utiltime.h>

#include <atomic>
#include <exception>
#include <map>
#include <memory> // for std::unique_ptr (used by MakeUnique)
#include <stdint.h>
#include <string>
#include <vector>

#include <boost/signals2/signal.hpp>
#include <boost/thread/condition_variable.hpp> // for boost::thread_interrupted

// Application startup time (used for uptime calculation)
int64_t GetStartupTime();

static const bool DEFAULT_LOGTIMEMICROS = false;
static const bool DEFAULT_LOGIPS = false;
static const bool DEFAULT_LOGTIMESTAMPS = true;
extern const char * const DEFAULT_DEBUGLOGFILE;

/** Signals for translation. */
class CTranslationInterface
{
public:
    /** Translate a message to the native language of the user. */
    boost::signals2::signal<std::string (const char* psz)> Translate;
};

extern bool fPrintToConsole;
extern bool fPrintToDebugLog;

extern bool fLogTimestamps;
extern bool fLogTimeMicros;
extern bool fLogIPs;
extern std::atomic<bool> fReopenDebugLog;
extern CTranslationInterface translationInterface;

extern const char * const BITCOIN_CONF_FILENAME;
extern const char * const BITCOIN_PID_FILENAME;

extern std::atomic<uint32_t> logCategories;

/**
 * Translation function: Call Translate signal on UI interface, which returns a boost::optional result.
 * If no translation slot is registered, the signal returns nothing and the input is returned unchanged.
 */
inline std::string _(const char* psz)
{
    boost::optional<std::string> rv = translationInterface.Translate(psz);
    return rv ? (*rv) : psz;
}

void SetupEnvironment();
bool SetupNetworking();

struct CLogCategoryActive
{
    std::string category;
    bool active;
};

namespace BCLog {
    enum LogFlags : uint32_t {
        NONE = 0,
        NET = (1 << 0),
        TOR = (1 << 1),
        MEMPOOL = (1 << 2),
        HTTP = (1 << 3),
        BENCH = (1 << 4),
        ZMQ = (1 << 5),
        DB = (1 << 6),
        RPC = (1 << 7),
        ESTIMATEFEE = (1 << 8),
        ADDRMAN = (1 << 9),
        SELECTCOINS = (1 << 10),
        REINDEX = (1 << 11),
        CMPCTBLOCK = (1 << 12),
        RAND = (1 << 13),
        PRUNE = (1 << 14),
        PROXY = (1 << 15),
        MEMPOOLREJ = (1 << 16),
        LIBEVENT = (1 << 17),
        COINDB = (1 << 18),
        QT = (1 << 19),
        LEVELDB = (1 << 20),
        ALL = ~(uint32_t)0,
    };
}
/** Return true if log accepts specified category */
static inline bool LogAcceptCategory(uint32_t category)
{
    return (logCategories.load(std::memory_order_relaxed) & category) != 0;
}

/** Returns a string with the log categories. */
std::string ListLogCategories();

/** Returns a vector of the active log categories. */
std::vector<CLogCategoryActive> ListActiveLogCategories();

/** Return true if str parses as a log category and set the flags in f */
bool GetLogCategory(uint32_t *f, const std::string *str);

/** Send a string to the log output */
int LogPrintStr(const std::string &str);

/** Get format string from VA_ARGS for error reporting */
template<typename... Args> std::string FormatStringFromLogArgs(const char *fmt, const Args&... args) { return fmt; }

static inline void MarkUsed() {}
template<typename T, typename... Args> static inline void MarkUsed(const T& t, const Args&... args)
{
    (void)t;
    MarkUsed(args...);
}

// Be conservative when using LogPrintf/error or other things which
// unconditionally log to debug.log! It should not be the case that an inbound
// peer can fill up a user's disk with debug.log entries.

#ifdef USE_COVERAGE
#define LogPrintf(...) do { MarkUsed(__VA_ARGS__); } while(0)
#define LogPrint(category, ...) do { MarkUsed(__VA_ARGS__); } while(0)
#else
#define LogPrintf(...) do { \
    std::string _log_msg_; /* Unlikely name to avoid shadowing variables */ \
    try { \
        _log_msg_ = tfm::format(__VA_ARGS__); \
    } catch (tinyformat::format_error &fmterr) { \
        /* Original format string will have newline so don't add one here */ \
        _log_msg_ = "Error \"" + std::string(fmterr.what()) + "\" while formatting log message: " + FormatStringFromLogArgs(__VA_ARGS__); \
    } \
    LogPrintStr(_log_msg_); \
} while(0)

#define LogPrint(category, ...) do { \
    if (LogAcceptCategory((category))) { \
        LogPrintf(__VA_ARGS__); \
    } \
} while(0)
#endif

template<typename... Args>
bool error(const char* fmt, const Args&... args)
{
    LogPrintStr("ERROR: " + tfm::format(fmt, args...) + "\n");
    return false;
}

void PrintExceptionContinue(const std::exception *pex, const char* pszThread);
void FileCommit(FILE *file);
bool TruncateFile(FILE *file, unsigned int length);
int RaiseFileDescriptorLimit(int nMinFD);
void AllocateFileRange(FILE *file, unsigned int offset, unsigned int length);
bool RenameOver(fs::path src, fs::path dest);
bool LockDirectory(const fs::path& directory, const std::string lockfile_name, bool probe_only=false);

/** Release all directory locks. This is used for unit testing only, at runtime
 * the global destructor will take care of the locks.
 */
void ReleaseDirectoryLocks();

bool TryCreateDirectories(const fs::path& p);
fs::path GetDefaultDataDir();
const fs::path &GetDataDir(bool fNetSpecific = true);
void ClearDatadirCache();
fs::path GetConfigFile(const std::string& confPath);
#ifndef WIN32
fs::path GetPidFile();
void CreatePidFile(const fs::path &path, pid_t pid);
#endif
#ifdef WIN32
fs::path GetSpecialFolderPath(int nFolder, bool fCreate = true);
#endif
fs::path GetDebugLogPath();
bool OpenDebugLog();
void ShrinkDebugFile();
void runCommand(const std::string& strCommand);

/**
 * Most paths passed as configuration arguments are treated as relative to
 * the datadir if they are not absolute.
 *
 * @param path The path to be conditionally prefixed with datadir.
 * @param net_specific Forwarded to GetDataDir().
 * @return The normalized path.
 */
fs::path AbsPathForConfigVal(const fs::path& path, bool net_specific = true);

inline bool IsSwitchChar(char c)
{
#ifdef WIN32
    return c == '-' || c == '/';
#else
    return c == '-';
#endif
}

class ArgsManager
{
protected:
    mutable CCriticalSection cs_args;
    std::map<std::string, std::string> mapArgs;
    std::map<std::string, std::vector<std::string>> mapMultiArgs;
public:
    void ParseParameters(int argc, const char*const argv[]);
    void ReadConfigFile(const std::string& confPath);

    /**
     * Return a vector of strings of the given argument
     *
     * @param strArg Argument to get (e.g. "-foo")
     * @return command-line arguments
     */
    std::vector<std::string> GetArgs(const std::string& strArg) const;

    /**
     * Return true if the given argument has been manually set
     *
     * @param strArg Argument to get (e.g. "-foo")
     * @return true if the argument has been set
     */
    bool IsArgSet(const std::string& strArg) const;

    /**
     * Return string argument or default value
     *
     * @param strArg Argument to get (e.g. "-foo")
     * @param strDefault (e.g. "1")
     * @return command-line argument or default value
     */
    std::string GetArg(const std::string& strArg, const std::string& strDefault) const;

    /**
     * Return integer argument or default value
     *
     * @param strArg Argument to get (e.g. "-foo")
     * @param nDefault (e.g. 1)
     * @return command-line argument (0 if invalid number) or default value
     */
    int64_t GetArg(const std::string& strArg, int64_t nDefault) const;

    /**
     * Return boolean argument or default value
     *
     * @param strArg Argument to get (e.g. "-foo")
     * @param fDefault (true or false)
     * @return command-line argument or default value
     */
    bool GetBoolArg(const std::string& strArg, bool fDefault) const;

    /**
     * Set an argument if it doesn't already have a value
     *
     * @param strArg Argument to set (e.g. "-foo")
     * @param strValue Value (e.g. "1")
     * @return true if argument gets set, false if it already had a value
     */
    bool SoftSetArg(const std::string& strArg, const std::string& strValue);

    /**
     * Set a boolean argument if it doesn't already have a value
     *
     * @param strArg Argument to set (e.g. "-foo")
     * @param fValue Value (e.g. false)
     * @return true if argument gets set, false if it already had a value
     */
    bool SoftSetBoolArg(const std::string& strArg, bool fValue);

    // Forces an arg setting. Called by SoftSetArg() if the arg hasn't already
    // been set. Also called directly in testing.
    void ForceSetArg(const std::string& strArg, const std::string& strValue);
};

extern ArgsManager gArgs;

/**
 * Format a string to be used as group of options in help messages
 *
 * @param message Group name (e.g. "RPC server options:")
 * @return the formatted string
 */
std::string HelpMessageGroup(const std::string& message);

/**
 * Format a string to be used as option description in help messages
 *
 * @param option Option message (e.g. "-rpcuser=<user>")
 * @param message Option description (e.g. "Username for JSON-RPC connections")
 * @return the formatted string
 */
std::string HelpMessageOpt(const std::string& option, const std::string& message);

/**
 * Return the number of cores available on the current system.
 * @note This does count virtual cores, such as those provided by HyperThreading.
 */
int GetNumCores();

void RenameThread(const char* name);

/**
 * A wrapper that names the thread, logs its start and exit, and calls func once.
 * Exceptions are logged and rethrown.
 */
template <typename Callable> void TraceThread(const char* name, Callable func)
{
    std::string s = strprintf("bitcoin-%s", name);
    RenameThread(s.c_str());
    try
    {
        LogPrintf("%s thread start\n", name);
        func();
        LogPrintf("%s thread exit\n", name);
    }
    catch (const boost::thread_interrupted&)
    {
        LogPrintf("%s thread interrupt\n", name);
        throw;
    }
    catch (const std::exception& e) {
        PrintExceptionContinue(&e, name);
        throw;
    }
    catch (...) {
        PrintExceptionContinue(nullptr, name);
        throw;
    }
}

std::string CopyrightHolders(const std::string& strPrefix);

//! Substitute for C++14 std::make_unique.
template <typename T, typename... Args>
std::unique_ptr<T> MakeUnique(Args&&... args)
{
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

#endif // BITCOIN_UTIL_H
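
The log categories declared in this header form a bitmask that `LogAcceptCategory` tests with a relaxed atomic load. The following standalone sketch shows how such a mask behaves. It copies only a few flag values from the header, wraps everything in a `sketch` namespace, and the `EnableCategory` / `DisableCategory` helpers are hypothetical names introduced for this illustration; they are not declared in the header.

```cpp
#include <atomic>
#include <stdint.h>

namespace sketch {

// Subset of the flag values from the header, reproduced for illustration.
enum LogFlags : uint32_t {
    NONE  = 0,
    NET   = (1 << 0),
    TOR   = (1 << 1),
    BENCH = (1 << 4),
    ALL   = ~(uint32_t)0,
};

std::atomic<uint32_t> logCategories{0};

// Same test as in the header: true if any requested bit is enabled.
inline bool LogAcceptCategory(uint32_t category)
{
    return (logCategories.load(std::memory_order_relaxed) & category) != 0;
}

// Hypothetical helpers (not part of the header) to set and clear bits.
inline void EnableCategory(uint32_t category) { logCategories |= category; }
inline void DisableCategory(uint32_t category) { logCategories &= ~category; }

} // namespace sketch
```

Because each category occupies its own bit, several categories can be enabled in one call by OR-ing them together, and `ALL` matches any enabled category.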