tcmalloc's Introduction

TCMalloc

This repository contains the TCMalloc C++ code.

TCMalloc is Google's customized implementation of C's malloc() and C++'s operator new used for memory allocation within our C and C++ code. TCMalloc is a fast, multi-threaded malloc implementation.

Building TCMalloc

Bazel is the official build system for TCMalloc.

The TCMalloc Platforms Guide contains information on platform support for TCMalloc.

Documentation

All users of TCMalloc should consult the following documentation resources:

  • The TCMalloc Quickstart covers downloading, installing, building, and testing TCMalloc, including incorporating within your codebase.
  • The TCMalloc Overview covers the basic architecture of TCMalloc, and how that may affect configuration choices.
  • The TCMalloc Reference covers the C and C++ TCMalloc API endpoints.

Users with more advanced needs may find the following documentation useful:

  • The TCMalloc Tuning Guide covers the configuration choices in more depth, and also illustrates other ways to customize TCMalloc. This also covers important operating system-level properties for improving TCMalloc performance.
  • The TCMalloc Design Doc covers how TCMalloc works underneath the hood, and why certain design choices were made. Most developers will not need this level of implementation detail.
  • The TCMalloc Compatibility Guide documents our expectations for how our APIs are used.

License

The TCMalloc library is licensed under the terms of the Apache license. See LICENSE for more information.

Disclaimer: This is not an officially supported Google product.

tcmalloc's People

Contributors

atdt, avieira-arm, bdu91, ckennelly, compnerd, derekmauro, dvyukov, employedrussian, ezbr, fowles, jacobsa, jcking, jrajahalme, junyer, kda, martijnvels, martinmaas, melver, mkruskal-google, morehouse, mumbleskates, nilayvaish, northbadge, patrickxia, paulburton, q-ge, snehasish, thorvald, tocarip, v-gogte

tcmalloc's Issues

Add Safe-Linking to Single-Linked-Lists (SLL)

I found this repository by following a link from gperftools, and now I am not sure what to do with the pull request I sent to gperftools, which I thought was the public implementation of TCMalloc.

tl;dr - Safe-Linking is a security feature that I added on top of the Single-Linked-Lists (SLL). It is similar to the existing maskPtr() functionality that was added to Chrome's TCMalloc implementation by Chrome's security team in 2012. Safe-Linking (as a concept) is now in the process of also being merged into GLIBC's ptmalloc implementation, and also uClibc's dlmalloc implementation.

I was hoping that this security enhancement could be integrated into TCMalloc to help prevent attacks, as detailed in my white paper, which can be found on the original pull request. The benchmarking results for all of the different heap implementations are very encouraging: according to gperftools' benchmarks, the overhead is 0.02% for the average test and 1.5% for the worst test case.

This feature was already implemented for gperftools and sent as a pull request, including the CLA and everything. Could you check whether you could integrate it into your implementation as well? I didn't send a new pull request because the changes needed to my original pull request seem minor and I am having trouble setting up your dev environment on my computer...
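
For readers unfamiliar with the idea, here is a minimal sketch of Safe-Linking-style pointer masking on a singly linked free list. It follows the scheme used in glibc, where the stored next pointer is XORed with its own storage address shifted right by the page bits. The 12-bit shift and the names Protect, Reveal, SafePush, and SafePop are illustrative assumptions, not the code from the pull request.

```cpp
#include <cassert>
#include <cstdint>

// Assumed page shift (4 KiB pages); the real value is platform-dependent.
constexpr unsigned kPageShift = 12;

// Mask the pointer value `ptr` using the address `pos` where it is stored.
inline void* Protect(void* pos, void* ptr) {
  return reinterpret_cast<void*>(
      (reinterpret_cast<std::uintptr_t>(pos) >> kPageShift) ^
      reinterpret_cast<std::uintptr_t>(ptr));
}

// XOR is its own inverse, so revealing is the same operation.
inline void* Reveal(void* pos, void* masked) { return Protect(pos, masked); }

// Push `element` onto the list. The next pointer is stored masked in the
// first word of the element.
inline void SafePush(void** head, void* element) {
  void** slot = static_cast<void**>(element);
  *slot = Protect(slot, *head);
  *head = element;
}

// Pop the first element, unmasking its stored next pointer.
inline void* SafePop(void** head) {
  void* element = *head;
  assert(element != nullptr);
  void** slot = static_cast<void**>(element);
  *head = Reveal(slot, *slot);
  return element;
}
```

An attacker who overwrites a stored next pointer without knowing the address bits used as the mask ends up with an unpredictable unmasked value. Implementations typically pair this with an alignment check on the revealed pointer, which this sketch omits.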

unable to capture stacktrace with gcc __builtin options

struct stack_frame {
        struct stack_frame *prev;
        void *return_addr;
} __attribute__((packed));
typedef struct stack_frame stack_frame;

void backtrace_from_fp(void **buf, int size)
{
/*
        int i;
        stack_frame *fp;

        __asm__ __volatile__("movq %%rbp, %[fp]" :  [fp] "=r" (fp));

        for(i = 0; i < size && fp != NULL; fp = fp->prev, i++)
                buf[i] = fp->return_addr;
*/
                buf[0] =  __builtin_return_address (0);
                buf[2] =  __builtin_return_address (1);
}

My code which uses malloc

#include <malloc.h>
void f3()
{
  malloc(10);
}
void f2()
{
}
void f1()
{
}
void f()
{
}
main()
{
  f();
}

g++ -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free -lpthread pop.c .libs/libtcmalloc_and_profiler.a

Above is how I built against the static library.

I get the crash below:

Program received signal SIGSEGV, Segmentation fault.
0x000000000041f417 in backtrace_from_fp (size=10, buf=<optimized out>) at src/tcmalloc.cc:1914
1914                    buf[2] =  __builtin_return_address (1);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7_4.2.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libstdc++-4.8.5-36.el7_6.2.x86_64
(gdb) bt
#0  0x000000000041f417 in backtrace_from_fp (size=10, buf=<optimized out>) at src/tcmalloc.cc:1914
#1  tc_malloc (size=size@entry=1) at src/tcmalloc.cc:1924
#2  0x00000000004051d6 in TCMallocGuard::TCMallocGuard (this=<optimized out>) at src/tcmalloc.cc:1121
#3  0x0000000000403498 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at src/tcmalloc.cc:1153
#4  _GLOBAL__sub_I__ZN61FLAG__namespace_do_not_use_directly_use_DECLARE_int64_instead43FLAGS_tcmalloc_large_alloc_report_thresholdE ()
    at src/tcmalloc.cc:2319
#5  0x000000000041eb5d in __libc_csu_init ()
#6  0x00007ffff6ffeb95 in __libc_start_main () from /lib64/libc.so.6
#7  0x0000000000403bf8 in _start ()
(gdb)

Any help with this?

rgds
Balaji Kamal Kannadassan

On Intel CPU Skylake and newer __builtin_prefetch doesn't need to be protected with a null check

See: https://github.com/google/tcmalloc/blob/master/tcmalloc/internal/linked_list.h#L48

From Intel Manual:

The cache hierarchy of the Skylake microarchitecture has the following enhancements:
• Higher Cache bandwidth compared to previous generations.
• Simultaneous handling of more loads and stores enabled by enlarged buffers.
• Processor can do two page walks in parallel compared to one in Haswell microarchitecture and earlier generations.
• Page split load penalty down from 100 cycles in previous generation to 5 cycles.
• L3 write bandwidth increased from 4 cycles per line in previous generation to 2 per line.
• Support for the CLFLUSHOPT instruction to flush cache lines and manage memory ordering of flushed data using SFENCE.
• Reduced performance penalty for a software prefetch that specifies a NULL pointer.
• L2 associativity changed from 8 ways to 4 ways.

I did a quick test on my machine and did not see any DTLB_LOAD_MISSES on a prefetch of NULL (or of any address less than 4096, for that matter).
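
The check being discussed can be sketched as follows, assuming a GCC- or Clang-compatible compiler. GCC documents that __builtin_prefetch does not fault on an invalid address, so both variants behave the same functionally and differ only in cost. Node, PopGuarded, and PopUnguarded are hypothetical names for illustration, not the code in linked_list.h.

```cpp
#include <cassert>

struct Node {
  Node* next;
};

// Pop with the null check: prefetch only when there is a next node.
inline Node* PopGuarded(Node** head) {
  Node* front = *head;
  assert(front != nullptr);
  Node* next = front->next;
  if (next != nullptr) {
    __builtin_prefetch(next, /*rw=*/0, /*locality=*/3);
  }
  *head = next;
  return front;
}

// Pop without the check: may issue a prefetch of address 0 at the list tail.
inline Node* PopUnguarded(Node** head) {
  Node* front = *head;
  assert(front != nullptr);
  Node* next = front->next;
  __builtin_prefetch(next, /*rw=*/0, /*locality=*/3);
  *head = next;
  return front;
}
```

Whether dropping the branch is a net win depends on the microarchitectures the binary has to run on, so it would need to be measured on each target rather than assumed.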

Plans to support ARM64 as architecture?

I would love to use tcmalloc on an ARM64 system, but it seems to not be officially supported and when I try to run I get the following:

external/com_google_tcmalloc/tcmalloc/system-alloc.cc:525] MmapAligned() failed (size, alignment) 33554432 33554432 @ 0x417480 0x416eb8 0x405528 0x415614 0x415410 0x427b4c 0x426bc8 0x7fa12d4144 0x 0x
external/com_google_tcmalloc/tcmalloc/arena.cc:31] FATAL ERROR: Out of memory trying to allocate internal tcmalloc data (bytes, object-size) 131072 48 @ 0x4055a8 0x415614 0x415410 0x427b4c 0x426bc8 0

After a bit of debugging, it looks like every time it calls mmap with a hint it gets back the same address (which doesn't match the hint), for example (with some extra logging):

external/com_google_tcmalloc/tcmalloc/system-alloc.cc:507] mmap (result, hint, size) 0x7fb5bad000 0x1df184000000 33554432 @ 0x46e2cc 0x46f600 0x46d468 0x46cc48 0x4231d4 0x41d534 0x469364 0x469138 0x 
external/com_google_tcmalloc/tcmalloc/system-alloc.cc:507] mmap (result, hint, size) 0x7fb5bad000 0x76520000000 33554432 @ 0x46e2cc 0x46f600 0x46d468 0x46cc48 0x4231d4 0x41d534 0x469364 0x469138 0x 0
external/com_google_tcmalloc/tcmalloc/system-alloc.cc:507] mmap (result, hint, size) 0x7fb5bad000 0x7c0a8000000 33554432 @ 0x46e2cc 0x46f600 0x46d468 0x46cc48 0x4231d4 0x41d534 0x469364 0x469138 0x 0

The system I'm testing this on has a 4.9 kernel, so I can't try with MAP_FIXED_NOREPLACE. I also don't understand why mmap is always returning the same address, not sure if this is an ARM64 specific thing or something about my particular platform (clang-9, Ubuntu 18.04, Kernel 4.9.140-tegra, running on a Jetson Nano).

In any case, are there plans to support ARM64, and if not, any thoughts on what may be going on here?
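
The observation above (mmap returning an address other than the hint) can be reproduced outside of TCMalloc with a small helper like the one below. This is a sketch for 64-bit Linux only. HintResult and MapWithHint are hypothetical names, and the PROT_NONE anonymous mapping is an assumption chosen to keep the test cheap, not a claim about what system-alloc.cc does.

```cpp
#include <sys/mman.h>

#include <cstddef>
#include <cstdint>

// Result of one hinted mapping attempt.
struct HintResult {
  void* address;      // MAP_FAILED on error
  bool hint_honored;  // true if the kernel placed the mapping at the hint
};

// Ask the kernel for `size` bytes near `hint` without MAP_FIXED, so the
// kernel is free to ignore the hint and pick another address.
inline HintResult MapWithHint(std::uintptr_t hint, std::size_t size) {
  void* p = mmap(reinterpret_cast<void*>(hint), size, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return {p, p == reinterpret_cast<void*>(hint)};
}
```

Calling MapWithHint with the hints from the log (for example 0x1df184000000 and a size of 33554432) and printing both fields shows whether the platform ever honors hints in that range. Each successful mapping should be released with munmap.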

Compilation error

OS: Ubuntu 18.04 x86_64
GCC 7.5.0
I followed the installation guide. When I executed bazel test //tcmalloc/..., an error occurred.

In file included from ./tcmalloc/huge_page_aware_allocator.h:25:0,
                 from tcmalloc/huge_page_aware_allocator.cc:15:
./tcmalloc/huge_page_filler.h: In instantiation of 'void tcmalloc::SkippedSubreleaseCorrectnessTracker<kEpochs>::ReportUpdatedPeak(tcmalloc::Length) [with long unsigned int kEpochs = 600; tcmalloc::Length = long unsigned int]':
./tcmalloc/huge_page_filler.h:242:9:   required from 'void tcmalloc::FillerStatsTracker<kEpochs>::Report(tcmalloc::FillerStatsTracker<kEpochs>::FillerStats) [with long unsigned int kEpochs = 600]'
./tcmalloc/huge_page_filler.h:1791:24:   required from 'void tcmalloc::HugePageFiller<TrackerType>::UpdateFillerStatsTracker() [with TrackerType = tcmalloc::PageTracker<tcmalloc::SystemRelease>]'
./tcmalloc/huge_page_filler.h:1289:27:   required from 'void tcmalloc::HugePageFiller<TrackerType>::Contribute(TrackerType*, bool) [with TrackerType = tcmalloc::PageTracker<tcmalloc::SystemRelease>]'
tcmalloc/huge_page_aware_allocator.cc:135:33:   required from here

./tcmalloc/huge_page_filler.h:89:5: error: no matching function for call to 'tcmalloc::TimeSeriesTracker<tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseEntry, tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseUpdate, 600>::Report(<brace-enclosed initializer list>)'
     if (tracker_.Report({.confirmed_peak = current_peak})) {
     ^~

Can anybody help?

Infinite loop on concurrent allocations

When realloc or a similar function is called on the same memory from two threads without synchronization, I assumed that the program would crash.
But with TCMalloc it doesn't.

Minimal reproducible example:
Create a std::vector<double> and push elements to it from two different threads without any locks. The program then gets stuck in an infinite memory-allocation loop until the OOM killer terminates it.

The infinite loop is here:
https://github.com/google/tcmalloc/blob/master/tcmalloc/peak_heap_tracker.cc#L63
because Static::sampled_objects_.head_.next_ points to itself through both next_ and prev_.
This happens because both threads receive the same Span object from HugePageAwareAllocator here:
https://github.com/google/tcmalloc/blob/master/tcmalloc/tcmalloc.cc#L1481

I am a little confused as to why, because HugePageAwareAllocator takes pageheap_lock before any allocations.
We are using TCMalloc from commit a643d89610317be1eff9f7298104eef4c987d8d5.
This bug is fairly critical in production, where such races can pop up, so I hope it can be fixed soon (or maybe it is already fixed in newer versions?).

bt:

tcmalloc::PeakHeapTracker::MaybeSaveSample (this=0x4ff3c0 <tcmalloc::Static::peak_heap_tracker_>) at .../libs/tcmalloc/tcmalloc/peak_heap_tracker.cc:70
(anonymous namespace)::SampleifyAllocation (requested_size=0, requested_size@entry=2097152, weight=<optimized out>, requested_alignment=<optimized out>, requested_alignment@entry=1, cl=cl@entry=0, obj=obj@entry=0x0, span=span@entry=0x47447ec005e0, capacity=0x0) at .../libs/tcmalloc/tcmalloc/tcmalloc.cc:1469
(anonymous namespace)::do_malloc_pages (size=2097152, alignment=1) at .../libs/tcmalloc/tcmalloc/tcmalloc.cc:1529
slow_alloc<tcmalloc::TCMallocPolicy<tcmalloc::CppOomPolicy, tcmalloc::DefaultAlignPolicy, tcmalloc::InvokeHooksPolicy>, decltype(nullptr)>(tcmalloc::TCMallocPolicy<tcmalloc::CppOomPolicy, tcmalloc::DefaultAlignPolicy, tcmalloc::InvokeHooksPolicy>, unsigned long, decltype(nullptr)) (policy=..., size=2097152, capacity=<optimized out>) at .../libs/tcmalloc/tcmalloc/tcmalloc.cc:1826
std::__y1::__libcpp_allocate (__size=78420757185056, __align=8) at .../libs/cxxsupp/libcxx/include/new:261
std::__y1::allocator<double>::allocate (this=<optimized out>, __n=262144) at .../libs/cxxsupp/libcxx/include/memory:1869
std::__y1::allocator_traits<std::__y1::allocator<double> >::allocate (__a=..., __n=262144) at .../libs/cxxsupp/libcxx/include/memory:1585
std::__y1::__split_buffer<double, std::__y1::allocator<double>&>::__split_buffer (this=<optimized out>, __cap=262144, __start=131072, __a=...) at .../libs/cxxsupp/libcxx/include/__split_buffer:326
std::__y1::vector<double, std::__y1::allocator<double> >::__push_back_slow_path<double> (this=0x7ffd69d3c2e0, __x=<optimized out>) at .../libs/cxxsupp/libcxx/include/vector:1660
std::__y1::vector<double, std::__y1::allocator<double> >::push_back (this=0x7ffd69d3c2e0, __x=<optimized out>) at .../libs/cxxsupp/libcxx/include/vector:1692
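
For context, unsynchronized concurrent push_back calls on the same std::vector are a data race, which the C++ standard treats as undefined behavior, so no allocator is required to crash in any particular way. A minimal sketch of the synchronized version of the reproducer is shown below. FillFromTwoThreads is a hypothetical name, and the mutex is just one of several possible ways to serialize access.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Push `per_thread` elements from each of two threads into one vector,
// serializing every push_back with a mutex. Returns the final size.
inline std::size_t FillFromTwoThreads(std::size_t per_thread) {
  std::vector<double> values;
  std::mutex mu;
  auto worker = [&] {
    for (std::size_t i = 0; i < per_thread; ++i) {
      std::lock_guard<std::mutex> lock(mu);
      values.push_back(static_cast<double>(i));
    }
  };
  std::thread t1(worker);
  std::thread t2(worker);
  t1.join();
  t2.join();
  return values.size();
}
```

With the lock in place, reallocation of the vector's buffer can only happen in one thread at a time, so the same block is never passed to the allocator concurrently.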

Build fail

TCmalloc Version:

commit df10c10548065948d91f2bbfe7caf73cd8bfae85
Author: Chris Kennelly <[email protected]>
Date:   Thu Jun 18 09:43:17 2020 -0700

GCC version: gcc (Ubuntu 9.3.0-10ubuntu2) 9.3
bazel version: bazel 3.2.0
compile error:

ERROR: /home/dev/tcmalloc/tcmalloc/testing/BUILD:595:8: C++ compilation of rule '//tcmalloc/testing:limit_test' failed (Exit 1) gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++0x' -MD -MF ... (remaining 52 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
In file included from external/com_google_googletest/googletest/include/gtest/internal/gtest-death-test-internal.h:39,
                 from external/com_google_googletest/googletest/include/gtest/gtest-death-test.h:41,
                 from external/com_google_googletest/googletest/include/gtest/gtest.h:64,
                 from external/com_google_googletest/googlemock/include/gmock/internal/gmock-internal-utils.h:47,
                 from external/com_google_googletest/googlemock/include/gmock/gmock-actions.h:51,
                 from external/com_google_googletest/googlemock/include/gmock/gmock.h:59,
                 from tcmalloc/testing/limit_test.cc:25:
external/com_google_googletest/googletest/include/gtest/gtest-matchers.h: In instantiation of 'bool testing::internal::MatchesRegexMatcher::MatchAndExplain(const MatcheeStringType&, testing::MatchResultListener*) const [with MatcheeStringType = std::basic_string_view<char>]':
external/com_google_googletest/googletest/include/gtest/gtest-matchers.h:484:47:   required from 'bool testing::PolymorphicMatcher<Impl>::MonomorphicImpl<T>::MatchAndExplain(T, testing::MatchResultListener*) const [with T = const std::basic_string_view<char>&; Impl = testing::internal::MatchesRegexMatcher]'
external/com_google_googletest/googletest/include/gtest/gtest-matchers.h:483:10:   required from here
external/com_google_googletest/googletest/include/gtest/gtest-matchers.h:647:24: error: invalid initialization of reference of type 'const string&' {aka 'const std::__cxx11::basic_string<char>&'} from expression of type 'const std::basic_string_view<char>'
  647 |     const std::string& s2(s);

How to use static library artifacts in another build system

I am attempting to create a static redistributable libtcmalloc.a with the Bazel build so that the library may be brought into a different build system via the -I and -L options to gcc, but I'm not having much luck.

I can coerce Bazel to produce static libraries for everything by running from the root of the repo directory (hacky, and probably slightly wrong):

find . -name BUILD -type f -exec sed -i -e '/alwayslink = 1/d' -e '/linkstatic = 1/d' {} +
find . -name BUILD -type f -exec sed -i -r 's/cc_library\(/&\n    linkstatic = True,/' {} +

But when I run on the libtcmalloc.a that is produced:

ar x libtcmalloc.a

I only see tcmalloc.pic.o. If I compare this to the libtcmalloc_minimal.a that is produced from gperftools, I see a number of additional object files, which is more along the lines of what I was expecting.

I've also written a rough CMake build with the help of a bazel-to-cmake conversion tool, and I saw the same problem with the produced libtcmalloc.a as well.

What would be the recommended way to handle this while also accounting for tcmalloc's new dependency on Abseil? Or is there maybe a Bazel option I'm missing? I'm interested in statically linking everything, and ideally, libtcmalloc.a would include all objects that are needed for it to be consumed by another program and not get undefined reference to 'tcmalloc::Static::transfer_cache_' for example during the linking stage.

Viewing TCmalloc memory allocations

This is related to Chromium's TCMalloc, but maybe somebody here can help.

I modified Chromium's TCMalloc allocator to expose pagemap_ as shown below:

third_party/tcmalloc/chromium/src/page_heap.h

class PERFTOOLS_DLL_DECL PageHeap {
 public:
  PageHeap();
  typedef MapSelector<kAddressBits>::Type PageMap;
  PageMap pagemap_;

third_party/tcmalloc/chromium/src/static_vars.cc

PageHeap Static::pageheap2_;

third_party/tcmalloc/chromium/src/static_vars.h

static PageHeap pageheap2_;


I exposed PageHeap via pageheap2_, but somehow the 3-level radix tree does not contain any data or pointers, only "0".

My script (https://github.com/marcinguy/tcmalloc-inspector) shows the same:

USED:
BLOCK SUMMARY
0 blocks, 0 total size
size frequencies:
FREE:
BLOCK SUMMARY
0 blocks, 0 total size
size frequencies:
LOST:
BLOCK SUMMARY
0 blocks, 0 total size
size frequencies:

More GDB output here: https://github.com/marcinguy/tcmalloc-chromium

Any idea why?

Did I modify TCMalloc incorrectly?

With Google's TCMalloc in another sample program it works correctly.

https://github.com/marcinguy/tcmalloc-inspector

Thanks,

Capturing Backtrace on memory allocation ?

Hi All,

When memory gets allocated I capture a backtrace, but the problem is that backtrace itself calls malloc, so to avoid looping I have enabled a flag. It works, but performance takes a very bad hit. I was looking for other options and came across the code below.

#define _GNU_SOURCE
#include <dlfcn.h>
#include <execinfo.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void *
malloc (size_t size)
{
  const char *message = "malloc called\n";
  write (STDOUT_FILENO, message, strlen (message));
  void *next = dlsym (RTLD_NEXT, "malloc");
  return ((__typeof__ (malloc) *) next) (size);
}
int
main (void)
{
  /* This calls malloc.  */
  puts ("First call to backtrace.");
  void *buffer[10];
  backtrace (buffer, 10);
  backtrace (buffer, 10);
  backtrace (buffer, 10);
  /* This does not.  */
  puts ("Second call to backtrace.");
  backtrace (buffer, 10);
}

Output:
----------
./a.out
First call to backtrace.
malloc called
Second call to backtrace.

Is there anything similar that can be done with tcmalloc so that performance doesn't take a hit?

rgds
Balaji Kamal Kannadassan
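
The output above matches the behavior described in the glibc backtrace man page: the function lives in libgcc, which is loaded dynamically on first use, and that load usually triggers a malloc call. One way to sidestep it, sketched below for glibc only, is to call backtrace once during startup before allocation tracking is switched on. g_tracking_enabled, WarmUpBacktraceThenEnableTracking, and CaptureIfEnabled are hypothetical names, and nothing here is a TCMalloc API.

```cpp
#include <execinfo.h>

#include <atomic>

// Hypothetical switch consulted by the allocation hook.
std::atomic<bool> g_tracking_enabled{false};

// Call backtrace() once so that its lazy initialization (and any malloc it
// performs) happens before tracking starts, then enable tracking.
inline void WarmUpBacktraceThenEnableTracking() {
  void* scratch[4];
  backtrace(scratch, 4);
  g_tracking_enabled.store(true, std::memory_order_release);
}

// Capture up to `max_frames` return addresses, or nothing while tracking is
// still disabled. Returns the number of frames written.
inline int CaptureIfEnabled(void** frames, int max_frames) {
  if (!g_tracking_enabled.load(std::memory_order_acquire)) return 0;
  return backtrace(frames, max_frames);
}
```

An allocation hook would call CaptureIfEnabled on each allocation. Whether this removes the need for the re-entrancy flag entirely depends on the unwinder in use, so that should be verified rather than assumed.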

tcmalloc not releasing memory vs kubernetes

Hi, we are using tcmalloc inside of kubernetes and we keep seeing kubelets running over memory limits even though they have no reason to.

I suspect that tcmalloc is not releasing memory. That is normally desirable, but not if it trips the OOM killer in Kubernetes. Should we just use ulimit?

How to link with tcmalloc from existing CMake project?

I use the gperftools tcmalloc and I'm interested in trying this variant.

It is unclear to me how, once I build tcmalloc using bazel, I can 'install' the build artifacts in a way I can consume them from my existing cmake project. The documentation assumes a bazel-build consumer.

Document THP Settings

For tcmalloc_huge_pages, some Linux settings are clearly required to enable transparent huge pages (THP). Can you please document the expected settings for:

  • /sys/kernel/mm/transparent_hugepage/enabled
  • /sys/kernel/mm/transparent_hugepage/defrag
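
Until the expected values are documented, the current settings can at least be inspected. The sketch below reads one of these sysfs files and extracts the active option, which the kernel marks with square brackets (for example "always [madvise] never"). SelectedOption and ReadSelectedOption are hypothetical helper names, and the sketch makes no claim about which values tcmalloc_huge_pages requires.

```cpp
#include <fstream>
#include <string>

// Return the bracketed option from a line such as "always [madvise] never",
// or an empty string if no bracketed option is present.
inline std::string SelectedOption(const std::string& line) {
  const std::string::size_type open = line.find('[');
  if (open == std::string::npos) return "";
  const std::string::size_type close = line.find(']', open + 1);
  if (close == std::string::npos) return "";
  return line.substr(open + 1, close - open - 1);
}

// Read the first line of `path` and return its selected option, or an empty
// string if the file cannot be read.
inline std::string ReadSelectedOption(const std::string& path) {
  std::ifstream in(path);
  std::string line;
  if (!in || !std::getline(in, line)) return "";
  return SelectedOption(line);
}
```

Calling ReadSelectedOption on each of the two paths listed above reports what the running kernel is configured to do.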

built .so from latest master fails to load because of 'undefined symbol' on Ubuntu 18.04 LTS

After changing linkstatic = 1 to 0 in the BUILD file and running bazel build //tcmalloc, the build seems to be OK.

ldd /usr/local/lib/libtcmalloc.so 
	linux-vdso.so.1 (0x00007fff104e5000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/debug/libstdc++.so.6 (0x00007f7ab7b10000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7ab7772000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7ab755a000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7ab7169000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f7ab7ec9000)

But attempts to use it, such as opening it with dlopen() from C++ code, fail because of an undefined symbol:

./loader 
C++ dlopen demo

Opening tcmalloc.so...
Cannot open library: /usr/local/lib/libtcmalloc.so: undefined symbol: _ZN4absl19str_format_internal13FormatArgImpl8DispatchINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEbNS1_4DataENS0_24FormatConversionSpecImplEPv

Is this caused by an old libstdc++ version or an old gcc version (7.5.0)? Is there a list of requirements or a doc on how to build it?

rules_cc archive not found

shenderson-d3jgh5:workspace shenderson$ git clone https://github.com/google/tcmalloc.git
Cloning into 'tcmalloc'...
warning: templates not found in /Users/shenderson/.git-template
remote: Enumerating objects: 247, done.
remote: Counting objects: 100% (247/247), done.
remote: Compressing objects: 100% (197/197), done.
remote: Total 247 (delta 67), reused 230 (delta 50), pack-reused 0
Receiving objects: 100% (247/247), 573.68 KiB | 3.32 MiB/s, done.
Resolving deltas: 100% (67/67), done.
shenderson-d3jgh5:workspace shenderson$ cd tcmalloc/
shenderson-d3jgh5:tcmalloc shenderson$ bazel build //...
Starting local Bazel server and connecting to it...
WARNING: Download from https://mirror.bazel.build/github.com/bazelbuild/rules_cc/archive/7e650b11fe6d49f70f2ca7a1c4cb8bcc4a1fe239.zip failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
INFO: Analyzed 114 targets (45 packages loaded, 1105 targets configured).
INFO: Found 114 targets...
ERROR: /Users/shenderson/workspace/tcmalloc/tcmalloc/internal/BUILD:123:1: C++ compilation of rule '//tcmalloc/internal:logging' failed (Exit 1) wrapped_clang failed: error executing command external/local_config_cc/wrapped_clang '-D_FORTIFY_SOURCE=1' -fstack-protector -fcolor-diagnostics -Wall -Wthread-safety -Wself-assign -fno-omit-frame-pointer -O0 -DDEBUG '-std=c++11' -iquote . -iquote ... (remaining 31 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
warning: unknown warning option '-Wno-attribute-alias'; did you mean '-Wno-attributes'? [-Wunknown-warning-option]
tcmalloc/internal/logging.cc:22:10: fatal error: 'syscall.h' file not found
#include <syscall.h>
         ^~~~~~~~~~~
1 warning and 1 error generated.
INFO: Elapsed time: 16.659s, Critical Path: 1.07s
INFO: 5 processes: 5 darwin-sandbox.
FAILED: Build did NOT complete successfully

commit information:

git log
commit 84819522941112e40c7018870fbe9a83287097f3 (HEAD -> master, origin/master, origin/HEAD)
Author: Martin Maas <[email protected]>
Date:   Thu Feb 13 15:38:17 2020

    Refactor time series tracking.
    
    We are adding additional time series telemetry. To avoid duplication, this change factors out shared functionality from MinMaxTracker to be reused for the other time series trackers. It should not change any current behavior.
    
    PiperOrigin-RevId: 294989313
    Change-Id: I53e1329ef639aec9dde69d74b71e4279c76c58d8

Create a shared object artifact of tcmalloc

It appears that tcmalloc's build system is currently mostly written with static linking in mind. It would be great if there were a rule/target that produces a shared object of tcmalloc (potentially including dependencies like abseil).

libtcmalloc_minimal.so is missing

I tried to compile sentencepiece locally and got the following error. Any help is appreciated:

[ 16%] Built target sentencepiece_train-static
[ 83%] Built target sentencepiece-static
make[2]: *** No rule to make target '/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so', needed by 'src/spm_normalize'.  Stop.
make[1]: *** [CMakeFiles/Makefile2:104: src/CMakeFiles/spm_normalize.dir/all] Error 2

Of course, libtcmalloc_minimal.so is not in /usr/lib/x86_64-linux-gnu/. Any ideas on how to install it? Thanks.

Add About Section (enhancement)

It would be really great if some description could be added in 'About' section since this will make it easier to get a brief idea of the project.

Thanks!

Incorrect usage of PERCPU_USE_RSEQ_ASM_GOTO macro

Hi,

I've been trying to build tcmalloc with clang 8 to no avail. The compilation error is:

tcmalloc/internal/percpu_tcmalloc.h:255:7: error: 'asm goto' constructs are not supported yet
   asm goto(

The problem seems to be the following check

#ifdef PERCPU_USE_RSEQ_ASM_GOTO
  asm goto(
#else

and specifically the fact that PERCPU_USE_RSEQ_ASM_GOTO is always defined but has different values: 1 or 0. In my case I have it defined as 0 from here:

#else
#define PERCPU_USE_RSEQ_ASM_GOTO 0  // <<<<<<<<<<<<<<<<
#endif
#else
#define PERCPU_USE_RSEQ_ASM_GOTO 0
#endif

You can easily reproduce the issue

#include <iostream>

#if 0
#define PERCPU_USE_RSEQ_ASM_GOTO 1
#else
#define PERCPU_USE_RSEQ_ASM_GOTO 0
#endif

int
main()
{
#ifdef PERCPU_USE_RSEQ_ASM_GOTO
   std::cout << "defined, value: " << PERCPU_USE_RSEQ_ASM_GOTO << std::endl;
#endif
}
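
The difference between the two preprocessor checks can be shown side by side. This is a minimal sketch of the general #ifdef versus #if behavior, with the macro defined locally as a stand-in for the "defined as 0" case. SelectedByIfdef and SelectedByIf are hypothetical names, and this is not the patch that was applied to percpu_tcmalloc.h.

```cpp
// Stand-in definition mirroring the "defined as 0" case from the report.
#define PERCPU_USE_RSEQ_ASM_GOTO 0

// #ifdef only asks whether the macro exists, so this is true even for 0.
inline bool SelectedByIfdef() {
#ifdef PERCPU_USE_RSEQ_ASM_GOTO
  return true;
#else
  return false;
#endif
}

// #if tests the macro's value, so 0 selects the fallback branch.
inline bool SelectedByIf() {
#if PERCPU_USE_RSEQ_ASM_GOTO
  return true;
#else
  return false;
#endif
}
```

When a macro is always defined and carries its meaning in its value, testing it with #if is the form that matches the intent.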

How would I build a shared object file?

I'm trying to get my head around Bazel: is there a way of building a tcmalloc.so shared object file from this project?

I know the recommendation is just to compile applications with tcmalloc directly but for my use case: https://github.com/SamSaffron/allocator_bench I would like to do a side by side comparison to perftools and jemalloc that are LD_PRELOADed.

Also not against experimenting with a statically compiled ruby including tcmalloc, if we can prove it is faster / better maybe Ruby folks would be open to adding it.

Intercepting file I/O can cause deadlock in TCMalloc

TCMalloc attempts to read files in /sys/ (via Abseil to get CPU frequency & whatnot) while holding a spinlock in Static::SlowInitIfNecessary. If run with a preloaded library that intercepts glibc I/O calls & then does a dlsym, this can cause a deadlock because the dlsym will cause TCMalloc to try to reenter its initialization routine while holding the spinlock.

Admittedly this is grotty behavior on the part of the intercepting library (used to implement the Ekam build system). I'm going to push a PR there too to work around this. However, a simple fix here on the part of TCMalloc would be to call the Abseil functions that can read from /sys/ before entering the critical section. Since Abseil caches the values from /sys/, this would remove the need for the workaround. Avoiding doing I/O while holding critical sections is probably a good idea anyway.
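
The proposed reordering can be sketched as follows: query the cached value before taking the lock, so that any file I/O happens outside the critical section. Everything here is a hypothetical stand-in (CachedCpuFrequency, InitState, SketchSlowInit, and the placeholder frequency value), with std::mutex used in place of TCMalloc's spinlock. It is not the actual initialization code.

```cpp
#include <mutex>

// Hypothetical stand-in for a query that does I/O on first use and caches
// the result afterwards. Real code would read from /sys/ here.
inline double CachedCpuFrequency() {
  static const double value = [] { return 2.0e9; }();  // placeholder value
  return value;
}

struct InitState {
  std::mutex mu;
  bool initialized = false;
  double cpu_frequency = 0.0;
};

// Warm the cache BEFORE entering the critical section, then do only
// in-memory work while the lock is held.
inline void SketchSlowInit(InitState& state) {
  const double frequency = CachedCpuFrequency();  // may do I/O, no lock held
  std::lock_guard<std::mutex> lock(state.mu);
  if (state.initialized) return;
  state.cpu_frequency = frequency;
  state.initialized = true;
}
```

If an interposed I/O function re-enters the allocator during the first line of SketchSlowInit, no lock is held yet, so the re-entrant call cannot block on a lock owned by its own thread.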

Question: Is it possible to force tcmalloc to use hugepages

I mean 1GiB pages. It looks like tcmalloc can use transparent huge pages out of the box, but I didn't find any evidence that it can use 1GiB huge pages. If that is not supported, is there a way to just give it a starting pointer and the size of the available memory that I obtain from mmap?

Segmentation fault when building with ASan

I always get a segmentation fault when building with ASan on. Here is how to trigger the problem:

bazel run --copt=-fsanitize=address --linkopt=-fsanitize=address tcmalloc/testing:hello_main

It appears the seg fault happens before the main function is called, as I added a quick print as the first statement in main and it was not printed. The seg fault goes away if I remove malloc = "//tcmalloc" from the cc_binary target.

I use the LLVM 10 toolchain on Ubuntu 20.04. Bazel version 3.4.1. TCMalloc is at commit 65bf455.

Provide documentation about how this project differs from gperftools/gperftools

I'm really excited to see this new version of tcmalloc becoming available. In particular, the per-cpu support has long been an idea of interest. However, it is currently quite unclear how this new project compares with the existing gperftools/gperftools project. I think it would be helpful if this project contained some documentation that provided a direct comparison with the other (soon to be legacy?) project. Some roadmap and future directions content would be welcome as well. In no particular order:

  • The old gperftools supported a wider array of platforms. On the OS side, Windows and macOS, at least to some degree. This project looks to currently be Linux only. Is support for those other operating systems planned? Explicitly out of scope? Similar questions regarding CPU. I note that ppc (presumably ppc64le?) is supported. But s390x (not surprising) and arm64 (quite surprising?) are absent. Are they on the horizon? Is work from the community to support those other platforms welcome?

  • What exactly has changed regarding support for CPU and heap profiling? It looks like they are more or less gone? Which is fine, at least for my use, I'd just like to know for sure either way.

  • Similar question regarding debugallocation. It seems that some of the classic debug allocator features that were part of gperftools may no longer be included. But at least use-after-free detection seems like it is still present, per some references to 0xcd? It probably makes sense to de-emphasize these sorts of features in a world with ASAN, but some more information here would be welcome. And are there new interesting debugging features added?

  • What previously offered tunings or configurations have been removed or added?

  • What is the degree of stability of the code at this point? Should projects that have longstanding integrations with gperftools be looking to switch now? If not, what are the gating changes?

  • Is there a release/tag/branch strategy? ABI stability goals? What should happen with packaging, especially for systems where the OS provides a "tcmalloc" package that derives from the old gperftools project?

  • What is the plan regarding synchronization between this project and the internal Google tcmalloc implementation? How open is the project to community contributions? Will those contributions be synced back to google, or will this eventually become another fork, as somewhat happened to gperftools?

I know that is a lot of questions, but I'm hopeful that putting some of the answers down in writing will help everyone who currently uses gperftools in their projects to understand how this new project should be approached.

I'd also like to thank you in advance for all the work that I am certain went into getting this new version of tcmalloc out into the world. Please don't take my long list of questions and concerns as anything other than deriving from a keen interest in the success of this new project.

release free memory on timer

I would like a timer that checks every minute and calls MallocExtension::instance()->ReleaseFreeMemory() when tcmalloc's free memory usage reaches 50%.

The API could be: void SetTimerRelease(1 minute, 50% usage)
or
int GetMemoryUsage() // returns a value like the %MEM column of the 'top' command on Linux
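
A minimal sketch of how such a timer could be built on the application side, using only the C++ standard library. `PeriodicReleaser` and its two callbacks are hypothetical names introduced for illustration and are not part of tcmalloc or gperftools; how the free fraction is measured is left to the caller.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper (not a tcmalloc API): wakes up every `interval` and
// calls `release` whenever `free_fraction` reports a value at or above
// `threshold`. Both callbacks are supplied by the application.
class PeriodicReleaser {
 public:
  PeriodicReleaser(std::chrono::milliseconds interval, double threshold,
                   std::function<double()> free_fraction,
                   std::function<void()> release)
      : interval_(interval),
        threshold_(threshold),
        free_fraction_(std::move(free_fraction)),
        release_(std::move(release)),
        worker_([this] { Run(); }) {}

  // Stops the background thread and waits for it to finish.
  ~PeriodicReleaser() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stop_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

 private:
  void Run() {
    std::unique_lock<std::mutex> lock(mu_);
    // wait_for returns false on timeout (time to check) and true once
    // stop_ has been set (time to exit).
    while (!cv_.wait_for(lock, interval_, [this] { return stop_; })) {
      lock.unlock();  // do not hold the lock while running callbacks
      if (free_fraction_() >= threshold_) release_();
      lock.lock();
    }
  }

  const std::chrono::milliseconds interval_;
  const double threshold_;
  std::function<double()> free_fraction_;
  std::function<void()> release_;
  std::mutex mu_;
  std::condition_variable cv_;
  bool stop_ = false;
  std::thread worker_;  // declared last so every other member exists first
};
```

With gperftools, the `release` callback could simply wrap the MallocExtension::instance()->ReleaseFreeMemory() call mentioned above, with an interval of one minute and a threshold of 0.5.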

Lock contention bottleneck (jemalloc vs tcmalloc)

Recently I've been playing with tcmalloc (this new version) and found that once an allocation is too large to fit any size class, it always acquires the pageheap_lock (this is indeed described in the docs).

However, this became a bottleneck with multiple threads. Here is a sample that shows it; the setup is simply:

  • 16 threads are created
  • each allocates objects from 4k to 1M (the allocation size is multiplied by 4 each time)
  • tcmalloc is configured with 256K pages and without sampling
  • jemalloc uses per-CPU arenas
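
The linked sample is not reproduced here. A rough sketch of the workload described above might look like the following; `AllocateLadder` and `RunWorkload` are hypothetical names, the round count is an arbitrary parameter, and the allocator is whatever `malloc` the binary is linked against.

```cpp
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Allocates and frees blocks of 4 KiB, 16 KiB, 64 KiB, ... up to max_size,
// repeating the ladder `rounds` times. Returns the bytes requested.
std::size_t AllocateLadder(std::size_t max_size, int rounds) {
  std::size_t total = 0;
  for (int r = 0; r < rounds; ++r) {
    for (std::size_t size = 4096; size <= max_size; size *= 4) {
      void* p = std::malloc(size);
      if (p == nullptr) std::abort();
      // Touch the block through a volatile pointer so the compiler cannot
      // optimize the malloc/free pair away.
      static_cast<volatile char*>(p)[0] = 1;
      std::free(p);
      total += size;
    }
  }
  return total;
}

// Runs the ladder on num_threads threads at once and returns the total
// number of bytes requested across all of them.
std::size_t RunWorkload(int num_threads, std::size_t max_size, int rounds) {
  std::vector<std::size_t> totals(num_threads, 0);
  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; ++t) {
    threads.emplace_back([&totals, t, max_size, rounds] {
      totals[t] = AllocateLadder(max_size, rounds);
    });
  }
  for (auto& th : threads) th.join();
  std::size_t sum = 0;
  for (std::size_t v : totals) sum += v;
  return sum;
}
```

Calling `RunWorkload(16, 1 << 20, rounds)` corresponds to the uncapped case, and passing `256 * 1024` as `max_size` corresponds to the "capped to 256K" case.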

And the results (you can also find these numbers in the comments):

conf                      real        user        sys
jemalloc                  0m10.816s   2m24.375s   0m0.230s
tcmalloc                  0m19.837s   4m32.754s   0m3.329s
jemalloc capped to 256K   0m2.567s    0m32.748s   0m0.020s
tcmalloc capped to 256K   0m2.335s    0m28.804s   0m0.010s

The sys time is mostly due to futex calls.

Plus some locking info (nothing better than strace is needed; it shows the problem):

  • jemalloc:
$ time strace -qq -fefutex -c allocator-perf-jemalloc 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.448936       11223        40           futex
------ ----------- ----------- --------- --------- ----------------
100.00    0.448936                    40           total

real    0m10.851s
user    2m27.460s
sys     0m0.767s
  • tcmalloc:
$ time strace -qq -fefutex -c allocator-perf-tcmalloc 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00   29.896237          18   1619220    629766 futex
------ ----------- ----------- --------- --------- ----------------
100.00   29.896237               1619220    629766 total

real    0m27.448s
user    3m44.494s
sys     0m33.782s

Any plans on improving this?
Or maybe adding support for custom size classes? By providing some helpers to generate them (I can even generate them right now, with some small modifications)

tcmalloc version: 8738f27

BUILD fail

I just cloned this project with git and ran bazel test //tcmalloc:cpu_cache_test, and I got the following errors.

(The error output was attached as a screenshot.)

BTW, I'm using Ubuntu 18.04, gcc 5.5.0 and Bazel 0.24.1.

Calling tcmalloc using Rust FFI

To provide some context, we are building a memory management framework written in Rust. We are experimenting with different malloc/free implementations.

We tried the following two approaches to linking with tcmalloc.
First, we tried to link statically with tcmalloc, using the libtcmalloc.lo produced by bazel build tcmalloc.
We got the following error:

Error: failed <executable> because /path/to/OurLibrary.so: undefined symbol: _ZN8tcmalloc17tcmalloc_internal10Parameters23per_cpu_caches_enabled_E

Second, we tried to link dynamically with tcmalloc. Specifically, we deleted linkstatic=1 (https://github.com/google/tcmalloc/blob/f4a573f/tcmalloc/BUILD#L91) and used the libtcmalloc.so produced by Bazel.
We got the following error when running our executable linked with tcmalloc:

Error: failed <executable> because /path/to/libtcmalloc.so: undefined symbol: _ZN4absl19str_format_internal13FormatArgImpl8DispatchINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEbNS1_4DataENS0_24FormatConversionSpecImplEPv

We are wondering what the best practice is for using tcmalloc standalone (ODR is not a big issue for us, as the project is in Rust; see #16 and #48).

cc @caizixian

releasing_test failed

The commit I use is 3dda5d0


I use gcc 7.3.1 to build and run the test:
CXX=/usr/bin/g++ CC=/usr/bin/gcc bazel --output_user_root=/data/bazel-cache test //tcmalloc/...

And there's one failure. Here's the log:

cat  /data/bazel-cache/53be17487e92ab49f9a9a0a4d546d9a6/execroot/com_google_tcmalloc/bazel-out/k8-fastbuild/testlogs/tcmalloc/testing/releasing_test/test.log
exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //tcmalloc/testing:releasing_test
-----------------------------------------------------------------------------
tcmalloc/testing/releasing_test.cc:142] Unmapped Memory [Before] 155189248
tcmalloc/testing/releasing_test.cc:144] Unmapped Memory [After ] 2535456768
tcmalloc/testing/releasing_test.cc:146] Unmapped Memory [Diff  ] 2380267520
tcmalloc/testing/releasing_test.cc:148] Memory Usage [Before] 2358042624
tcmalloc/testing/releasing_test.cc:150] Memory Usage [After ] 51367936
tcmalloc/testing/releasing_test.cc:152] Memory Usage [Diff  ] 2306674688
(after_unmapped - before_unmapped) != (before - after):18446744071794851840] 2306674688 @ 0x40e8da 0x7f94f4340c05
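
The 18446744071794851840 in the last log line is numerically consistent with the unmapped-memory difference (2380267520, which exceeds 2^31) having passed through a signed 32-bit value before being widened back to 64 bits. This is an inference from the numbers in the log, not something confirmed against the test source. The arithmetic can be checked with a small sketch; `RoundTripThroughInt32` is a hypothetical helper written only for this illustration.

```cpp
#include <cstdint>

// Keeps the low 32 bits of `value`, interprets them as a two's-complement
// signed 32-bit integer, and widens the result back to an unsigned 64-bit
// value. This mimics what happens when a large size is stored in an `int`
// and later converted to a 64-bit unsigned type.
std::uint64_t RoundTripThroughInt32(std::uint64_t value) {
  const std::uint32_t low = static_cast<std::uint32_t>(value);
  const std::int64_t as_signed =
      low >= 0x80000000u ? static_cast<std::int64_t>(low) - 4294967296LL
                         : static_cast<std::int64_t>(low);
  // Converting a negative value to an unsigned type wraps modulo 2^64.
  return static_cast<std::uint64_t>(as_signed);
}
```

Here 2380267520 - 2^32 = -1914699776, and 2^64 - 1914699776 = 18446744071794851840, which matches the logged value. Values below 2^31, such as the "before" unmapped figure of 155189248, survive the round trip unchanged.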

build failure

I'm trying to build on Debian 10, gcc 8.3.0

/usr/src/tcmalloc# bazel test //tcmalloc/...
INFO: Analyzed 134 targets (0 packages loaded, 0 targets configured).
INFO: Found 52 targets and 82 test targets...
ERROR: /usr/src/tcmalloc/tcmalloc/BUILD:300:11: C++ compilation of rule '//tcmalloc:common_deprecated_perthread' failed (Exit 1) gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++0x' -MD -MF ... (remaining 28 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
In file included from ./tcmalloc/huge_page_aware_allocator.h:25,
                 from tcmalloc/huge_page_aware_allocator.cc:15:
./tcmalloc/huge_page_filler.h: In instantiation of 'void tcmalloc::SkippedSubreleaseCorrectnessTracker<kEpochs>::ReportUpdatedPeak(tcmalloc::Length) [with long unsigned int kEpochs = 600; tcmalloc::Length = long unsigned int]':
./tcmalloc/huge_page_filler.h:270:9:   required from 'void tcmalloc::FillerStatsTracker<kEpochs>::Report(tcmalloc::FillerStatsTracker<kEpochs>::FillerStats) [with long unsigned int kEpochs = 600]'
./tcmalloc/huge_page_filler.h:1884:24:   required from 'void tcmalloc::HugePageFiller<TrackerType>::UpdateFillerStatsTracker() [with TrackerType = tcmalloc::PageTracker<tcmalloc::SystemRelease>]'
./tcmalloc/huge_page_filler.h:1345:3:   required from 'void tcmalloc::HugePageFiller<TrackerType>::Contribute(TrackerType*, bool) [with TrackerType = tcmalloc::PageTracker<tcmalloc::SystemRelease>]'
tcmalloc/huge_page_aware_allocator.cc:135:33:   required from here
./tcmalloc/huge_page_filler.h:89:5: error: no matching function for call to 'tcmalloc::TimeSeriesTracker<tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseEntry, tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseUpdate, 600>::Report(<brace-enclosed initializer list>)'
     if (tracker_.Report({.confirmed_peak = current_peak})) {
     ^~
In file included from ./tcmalloc/huge_cache.h:32,
                 from ./tcmalloc/huge_page_aware_allocator.h:24,
                 from tcmalloc/huge_page_aware_allocator.cc:15:
./tcmalloc/internal/timeseries_tracker.h:153:6: note: candidate: 'bool tcmalloc::TimeSeriesTracker<T, S, kEpochs>::Report(S) [with T = tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseEntry; S = tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseUpdate; long unsigned int kEpochs = 600]'
 bool TimeSeriesTracker<T, S, kEpochs>::Report(S val) {
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./tcmalloc/internal/timeseries_tracker.h:153:6: note:   no known conversion for argument 1 from '<brace-enclosed initializer list>' to 'tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseUpdate'
In file included from ./tcmalloc/malloc_extension.h:39,
                 from ./tcmalloc/experiment.h:27,
                 from ./tcmalloc/huge_cache.h:27,
                 from ./tcmalloc/huge_page_aware_allocator.h:24,
                 from tcmalloc/huge_page_aware_allocator.cc:15:
external/com_google_absl/absl/functional/function_ref.h:101:3: error: 'absl::FunctionRef<R(Args ...)>::FunctionRef(const F&) [with F = tcmalloc::SkippedSubreleaseCorrectnessTracker<kEpochs>::ReportUpdatedPeak(tcmalloc::Length) [with long unsigned int kEpochs = 600; tcmalloc::Length = long unsigned int]::<lambda(size_t, int64_t, const tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseEntry&)>; <template-parameter-2-2> = void; R = void; Args = {long unsigned int, long int, const tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseEntry&}]', declared using local type 'const tcmalloc::SkippedSubreleaseCorrectnessTracker<kEpochs>::ReportUpdatedPeak(tcmalloc::Length) [with long unsigned int kEpochs = 600; tcmalloc::Length = long unsigned int]::<lambda(size_t, int64_t, const tcmalloc::SkippedSubreleaseCorrectnessTracker<600>::SkippedSubreleaseEntry&)>', is used but never defined [-fpermissive]
   FunctionRef(const F& f)  // NOLINT(runtime/explicit)
   ^~~~~~~~~~~
INFO: Elapsed time: 3.254s, Critical Path: 3.00s
INFO: 26 processes: 26 linux-sandbox.
FAILED: Build did NOT complete successfully

Bazel build fails on Mac OS X

Hi,

I'm using Bazel to build tcmalloc on Mac OS X 10.15.6 but I get an error:

❯ bazel build "@com_google_tcmalloc//tcmalloc"
INFO: Analyzed target @com_google_tcmalloc//tcmalloc:tcmalloc (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
INFO: From Compiling external/com_google_tcmalloc/tcmalloc/internal/environment.cc:
warning: unknown warning option '-Wno-attribute-alias'; did you mean '-Wno-attributes'? [-Wunknown-warning-option]
1 warning generated.
ERROR: /private/var/tmp/_bazel_username/148f9f6ebca6e47e7d6d5ed427a82e62/external/com_google_tcmalloc/tcmalloc/internal/BUILD:200:11: C++ compilation of rule '@com_google_tcmalloc//tcmalloc/internal:mincore' failed (Exit 1): wrapped_clang failed: error executing command
  (cd /private/var/tmp/_bazel_username/148f9f6ebca6e47e7d6d5ed427a82e62/sandbox/darwin-sandbox/59/execroot/__main__ && \
  exec env - \
    APPLE_SDK_PLATFORM=MacOSX \
    APPLE_SDK_VERSION_OVERRIDE=10.15 \
    PATH=/Users/username/.local/bin:/Users/username/go/bin:/Users/username/.cargo/bin:/usr/local/opt/llvm/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/bin:/usr/sbin:/sbin \
    XCODE_VERSION_OVERRIDE=11.6.0.11E708 \
  external/local_config_cc/wrapped_clang '-D_FORTIFY_SOURCE=1' -fstack-protector -fcolor-diagnostics -Wall -Wthread-safety -Wself-assign -fno-omit-frame-pointer -O0 -DDEBUG '-std=c++11' -iquote external/com_google_tcmalloc -iquote bazel-out/darwin-fastbuild/bin/external/com_google_tcmalloc -MD -MF bazel-out/darwin-fastbuild/bin/external/com_google_tcmalloc/tcmalloc/internal/_objs/mincore/mincore.d '-frandom-seed=bazel-out/darwin-fastbuild/bin/external/com_google_tcmalloc/tcmalloc/internal/_objs/mincore/mincore.o' -isysroot __BAZEL_XCODE_SDKROOT__ -F__BAZEL_XCODE_SDKROOT__/System/Library/Frameworks -F__BAZEL_XCODE_DEVELOPER_DIR__/Platforms/MacOSX.platform/Developer/Library/Frameworks '-mmacosx-version-min=10.15' -DHAVE_BAZEL_BUILD '-fdiagnostics-color=always' '-std=c++2a' -Wall -Wreturn-type -Wuninitialized -Wunused-result '-Werror=narrowing' '-Werror=reorder' -Wunused-local-typedefs '-Werror=conversion-null' '-Werror=overlength-strings' '-Werror=pointer-arith' '-Werror=varargs' '-Werror=vla' '-Werror=write-strings' -Wmissing-declarations -Wno-attribute-alias -Wno-sign-compare -Wno-uninitialized -Wno-unused-function -Wno-unused-result -Wno-unused-variable -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/com_google_tcmalloc/tcmalloc/internal/mincore.cc -o bazel-out/darwin-fastbuild/bin/external/com_google_tcmalloc/tcmalloc/internal/_objs/mincore/mincore.o)
Execution platform: @local_config_platform//:host

Use --sandbox_debug to see verbose messages from the sandbox wrapped_clang failed: error executing command
  (cd /private/var/tmp/_bazel_username/148f9f6ebca6e47e7d6d5ed427a82e62/sandbox/darwin-sandbox/59/execroot/__main__ && \
  exec env - \
    APPLE_SDK_PLATFORM=MacOSX \
    APPLE_SDK_VERSION_OVERRIDE=10.15 \
    PATH=/Users/username/.local/bin:/Users/username/go/bin:/Users/username/.cargo/bin:/usr/local/opt/llvm/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/bin:/usr/sbin:/sbin \
    XCODE_VERSION_OVERRIDE=11.6.0.11E708 \
  external/local_config_cc/wrapped_clang '-D_FORTIFY_SOURCE=1' -fstack-protector -fcolor-diagnostics -Wall -Wthread-safety -Wself-assign -fno-omit-frame-pointer -O0 -DDEBUG '-std=c++11' -iquote external/com_google_tcmalloc -iquote bazel-out/darwin-fastbuild/bin/external/com_google_tcmalloc -MD -MF bazel-out/darwin-fastbuild/bin/external/com_google_tcmalloc/tcmalloc/internal/_objs/mincore/mincore.d '-frandom-seed=bazel-out/darwin-fastbuild/bin/external/com_google_tcmalloc/tcmalloc/internal/_objs/mincore/mincore.o' -isysroot __BAZEL_XCODE_SDKROOT__ -F__BAZEL_XCODE_SDKROOT__/System/Library/Frameworks -F__BAZEL_XCODE_DEVELOPER_DIR__/Platforms/MacOSX.platform/Developer/Library/Frameworks '-mmacosx-version-min=10.15' -DHAVE_BAZEL_BUILD '-fdiagnostics-color=always' '-std=c++2a' -Wall -Wreturn-type -Wuninitialized -Wunused-result '-Werror=narrowing' '-Werror=reorder' -Wunused-local-typedefs '-Werror=conversion-null' '-Werror=overlength-strings' '-Werror=pointer-arith' '-Werror=varargs' '-Werror=vla' '-Werror=write-strings' -Wmissing-declarations -Wno-attribute-alias -Wno-sign-compare -Wno-uninitialized -Wno-unused-function -Wno-unused-result -Wno-unused-variable -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/com_google_tcmalloc/tcmalloc/internal/mincore.cc -o bazel-out/darwin-fastbuild/bin/external/com_google_tcmalloc/tcmalloc/internal/_objs/mincore/mincore.o)
Execution platform: @local_config_platform//:host

Use --sandbox_debug to see verbose messages from the sandbox
warning: unknown warning option '-Wno-attribute-alias'; did you mean '-Wno-attributes'? [-Wunknown-warning-option]
external/com_google_tcmalloc/tcmalloc/internal/mincore.cc:28:36: error: cannot initialize a parameter of type 'char *' with an lvalue of type 'unsigned char *'
    return ::mincore(addr, length, result);
                                   ^~~~~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk/usr/include/sys/mman.h:243:45: note: passing argument to parameter here
int     mincore(const void *, size_t, char *);
                                            ^
1 warning and 1 error generated.
Target @com_google_tcmalloc//tcmalloc:tcmalloc failed to build
INFO: Elapsed time: 0.658s, Critical Path: 0.53s
INFO: 1 process: 1 darwin-sandbox.
FAILED: Build did NOT complete successfully

Running the same command on Clear Linux 5.7.8-968 x86_64 with gcc compiles alright without errors.
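
The compiler error comes from a signature difference: the macOS SDK header quoted above declares the last parameter of mincore as char *, while Linux declares it as unsigned char *. A minimal sketch of a wrapper that compiles against both declarations is shown below. `PortableMincore` is a hypothetical name, and this only addresses this one compile error; it does not imply that the rest of tcmalloc builds on macOS.

```cpp
#include <sys/mman.h>

#include <cstddef>

// Calls mincore() with the residency-vector pointer type that the host
// platform's declaration expects. Returns whatever mincore() returns.
int PortableMincore(void* addr, std::size_t length, unsigned char* result) {
#if defined(__APPLE__)
  // macOS: int mincore(const void*, size_t, char*);
  return ::mincore(addr, length, reinterpret_cast<char*>(result));
#else
  // Linux: int mincore(void*, size_t, unsigned char*);
  return ::mincore(addr, length, result);
#endif
}
```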

Turning off linkstatic = 1 does not produce dynamic libraries.

I followed the guidance listed here (#27) on how to build a static library. In the various BUILD files, I commented out the linkstatic = 1 lines; however, I can't find any .so files produced:

find tcmalloc/ -name *.so

Since I'm on a SLES 15 HPC cluster, I can't install packages through zypper/apt/yum (no root access) so I am not sure how I can generate a .so file.

O(n) behavior for large allocations

This re-released version of tcmalloc is missing the fix that's available in gperftools: gperftools/gperftools@06c9414

The O(n) search over large spans becomes very expensive for any long-running application with >1MB allocations, since over time the large span list can accumulate thousands of entries due to fragmentation.
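
One general way to avoid a linear scan is to keep the large free spans in a container ordered by length, so that a best-fit lookup costs O(log n). The sketch below illustrates that idea with std::set. `FreeSpan`, `BestFitLess` and `LargeSpanSet` are hypothetical names, and this is not the code from the referenced gperftools commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a free span: `length` pages starting at page
// number `start`.
struct FreeSpan {
  std::uintptr_t start;
  std::size_t length;
};

// Orders spans by length first, then by start address, so the first span
// that is not less than a probe of a given length is the best fit.
struct BestFitLess {
  bool operator()(const FreeSpan& a, const FreeSpan& b) const {
    if (a.length != b.length) return a.length < b.length;
    return a.start < b.start;
  }
};

class LargeSpanSet {
 public:
  void Insert(FreeSpan span) { spans_.insert(span); }

  // Removes the smallest span holding at least `pages` pages (lowest start
  // address on ties) and stores it in *out. Returns false if none fits.
  bool TakeBestFit(std::size_t pages, FreeSpan* out) {
    auto it = spans_.lower_bound(FreeSpan{0, pages});  // O(log n)
    if (it == spans_.end()) return false;
    *out = *it;
    spans_.erase(it);
    return true;
  }

  std::size_t size() const { return spans_.size(); }

 private:
  std::set<FreeSpan, BestFitLess> spans_;
};
```

The trade-off is extra bookkeeping per span (a tree node instead of a list link) in exchange for lookups that stay fast even when fragmentation leaves thousands of large spans on the free list.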

debug workflow for "why is a process so big"

I want to inspect what is causing my program to use more memory than I expected. It seems gperftools is deprecated in favor of abseil's tcmalloc and the new Go-based pprof.

Is there a recommended workflow for inspecting which functions are using the memory, like HeapProfilerStart() or env HEAPPROFILE=/tmp/mybin.hprof in gperftools, some flag, /heapz, etc.? It seems that I can call absl::MallocExtension::SnapshotCurrent(), but it returns tcmalloc::Profile. Is there a way to convert it to profile.proto, which I assume is what the new pprof needs?
