Erlang and its consumption of Heap Memory

I have been running a highly concurrent application on my HP Proliant Servers. The application is a file system indexer i coded in erlang. It spawns a process per Folder it finds on the file system and records all file paths in a fragmented Mnesia Database. (Database consists of disc_only_copies type of tables and a screen shot of its file system can be viewed here.)

The Snippet of code that does the high intensive job of going through the file system is shown below:


%%% -------- COPYRIGHT NOTICE --------------------------------------------------------------------
%% @author Muzaaya Joshua, <joshmuza@gmail.com> [http://joshanderlang.blogspot.com]
%% @version 1.0 free software, but modification prohibited
%% @copyright Muzaaya Joshua (file_scavenger-1.0) 2011 - 2012 . All rights reserved
%% @reference <a href="http://www.erlang.org">OpenSource Erlang WebSite</a>
%% 
%%% ---------------- EDOC INTRODUCTION TO THE MODULE ----------------------------------------------
%% @doc This module provides the low level APIs for reading, writing,
%% searching, joining and moving within directories.The module implementation
%% took place on @date at @time.
%% @end

-module(file_scavenger_utilities).

%%% ------- EXPORTS -------------------------------------------------------------------------------
-compile(export_all).

%%% ------- INCLUDES -----------------------------------------------------------------------------

%%% -------- MACROS ------------------------------------------------------------------------------
-define(IS_FOLDER(X),filelib:is_dir(X)).
-define(IS_FILE(X),filelib:is_file(X)).
-define(FAILED_TO_LIST_DIR(X),error_logger:error_report(["*** File Scavenger Utilities Error ***** ",{error,"Failed to List Directory"},{directory,X}])).
-define(NOT_DIR(X),error_logger:error_report(["*** File Scavenger Utilities Error ***** ",{error,"Not a Directory"},{alleged,X}])).
-define(NOT_FILE(X),error_logger:error_report(["*** File Scavenger Utilities Error ***** ",{error,"Not a File"},{alleged,X}])).
%%%--------- TYPES -------------------------------------------------------------------------------

%% @type dir() = string(). 
%%  Must be containing forward slashes, not back slashes. Must not end with a slash
%%  after the exact directory.e.g this is wrong: "C:/Program Files/SomeDirectory/"
%%  but this is right: "C:/Program Files/SomeDirectory"
%% @type file_path() = string(). 
%%  Must be containing forward slashes, not back slashes.
%%  Should include the file extension as well e.g "C:/Program Files/SomeFile.pdf"

%% -----------------------------------------------------------------------------------------------
%% @doc Enters a directory and executes the fun ForEachFileFound/2 for each file it finds
%% If it finds a directory, it executes the fun %% ForEachDirFound/2. 
%% Both funs above take the parent Dir as the first Argument. Then, it will spawn an 
%% erlang process that will spread the found Directory too in the same way as the parent directory 
%% was spread. The process of spreading goes on and on until every File (wether its in a nested 
%% Directory) is registered by its full path.
%% @end
%%
%% @spec spread_directory(dir(),dir(),funtion(),function())-> ok.

spread_directory(Dir,Top_Directory,ForEachFileFound,ForEachDirFound) when is_function(ForEachFileFound),is_function(ForEachDirFound) ->
    case ?IS_FOLDER(Dir) of
        false -> ?NOT_DIR(Dir); 
        true -> 
            F = fun(X)->
                    FileOrDir = filename:absname_join(Dir,X),
                    case ?IS_FOLDER(FileOrDir) of
                        true -> 
                            (catch ForEachDirFound(Top_Directory,FileOrDir)),
                            spawn(fun() -> ?MODULE:spread_directory(FileOrDir,Top_Directory,ForEachFileFound,ForEachDirFound) end);
                        false -> 
                            case ?IS_FILE(FileOrDir) of
                                false -> {error,not_a_file,FileOrDir};
                                true -> (catch ForEachFileFound(Top_Directory,FileOrDir))
                            end
                    end
                end,
            case file:list_dir(Dir) of      
                {error,_} -> ?FAILED_TO_LIST_DIR(Dir);
                {ok,List} -> lists:foreach(F,List)
            end
    end.    

The function spread_directory/4 is generic in a way that it takes two funs . One fun: ForEachFileFound/2 takes along with the Top Most Directory, the found file and does anything with it and the other fun: ForEachDirFound/2 takes along with the Top Most Directory, the folder it finds and uses it in any way it wants.

The start script i use for this application makes sure that erlang will be able to spawn as many processes as possible. Once a process finishes indexing a folder it exits.

#!/usr/bin/env sh
echo "Starting File Scavenger System. Layer 1 on the P2P File Sharing System....."
erl 
    -name file_scavenger@127.0.0.1 
    +P 13421779 
    -pa ./ebin ./lib/*/ebin ./include 
    -mnesia dir '"./database"' 
    -mnesia dump_log_write_threshold 10000 
    -eval "application:load(file_scavenger)" 
    -eval "application:start(file_scavenger)"

There is a gen_server which interfaces the intensive module with the database in which i record all paths. A snippet of where it starts the spread_directory work is shown here below:

handle_cast(index_dirs,#scavenger{directory_paths = Dirs} = State)->
    {File,Folder} = case {State#scavenger.verbose,State#scavenger.verbose_to} of
                        {true,tty} -> 
                            {
                            fun(TopDir,Fl)-> 
                                io:format(" File: ~p~n",[Fl]),
                                file_scavenger_database:insert_file(filename:basename(Fl),file,Fl,TopDir,filename:extension(Fl))
                            end,
                            fun(TopDir,Fd) -> 
                                io:format(" Folder: ~p~n",[Fd]),
                                file_scavenger_database:insert_file(Fd,folder,Fd,TopDir,undefined)
                            end
                            };
                        {true,SomeFile}-> 
                            {
                            fun(TopDir,Fl)-> 
                                os:cmd("echo File: " ++ Fl ++ " >> " ++ SomeFile),
                                file_scavenger_database:insert_file(filename:basename(Fl),file,Fl,TopDir,filename:extension(Fl))
                            end,
                            fun(TopDir,Fd)-> 
                                os:cmd("echo Folder: " ++ Fd ++ " >> " ++ SomeFile),
                                file_scavenger_database:insert_file(Fd,folder,Fd,TopDir,undefined)
                            end
                            }                       
                    end,
    Main = fun(Dir) -> 
                error_logger:info_msg("*** File scavenger Server indexing directory: ~p~n",[Dir]),
                spawn(fun() -> file_scavenger_utilities:spread_directory(Dir,Dir,File,Folder) end)
            end,
    lists:foreach(Main,Dirs),
    {noreply,State};    
handle_cast(stop, State) -> {stop, normal, State}.

More Source details can be found in the whole application. The application entire Source and build can be found here: File_scavenger-1.0.zip.

Now, i start the application on the Server (HP Proliant G6, containing Intel processors (2 processors, each 4 cores, 2.4 GHz speed each core, 8 MB Cache size), 20 GB RAM size, 1.5 Terabytes disk space. Now, 2 of these high power machines are in our disposal. System Database should be replicated across the two. Each server runs Solaris 10, 64 bit), whose terminal now looks like this below:

bash-3.00# sh file_scavenger.sh
Starting File Scavenger System. Layer 1 on the P2P File Sharing System.....
Erlang R14B03 (erts-5.8.4) [source] [smp:8:8] [rq:8] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.8.4  (abort with ^G)
(file_scavenger@127.0.0.1)1>
=INFO REPORT==== 18-Aug-2011::09:36:04 ===
Starting File Scavenger Database......
=INFO REPORT==== 18-Aug-2011::09:36:04 ===
Database Successfully Started....

=INFO REPORT==== 18-Aug-2011::09:36:04 ===
Starting File Scavenger Database......
=INFO REPORT==== 18-Aug-2011::09:36:04 ===
Database Successfully Started....

=INFO REPORT==== 18-Aug-2011::09:36:04 ===
File Scavenger Server starting with default verbose settings....

(file_scavenger@127.0.0.1)1> file_scavenger_server:index_dirs().

The server starts to run and verboses to the terminal all files and folders it finds. The server is equipped with too much RAM (20 GB), and Swap space (Swap is 16 GB). However, it ran for about 18 hours and finally, the erlang Virtual machine reported this:

File: "/proc/4324/root/opt/csw/gcc4/share/locale/ja/LC_MESSAGES/gcc.mo"
 Folder: "/proc/4324/root/opt/csw/gcc4/share/locale/da"
 Folder: "/proc/4324/root/opt/csw/gcc4/share/locale/es/LC_MESSAGES"
 File: "/proc/4324/root/proc/4984/root/.thumbnails/normal/dc259e3897e8af4b379c6d956b6c1393.png"
 File: "/proc/4324/root/proc/4984/root/.thumbnails/fail/gnome-thumbnail-factory/223c19786421b7101d14075bdec46f61.png"
 File: "/proc/4324/root/opt/csw/gcc4/libexec/gcc/i386-pc-solaris2.10/4.5.1/install-tools/mkheaders"
 File: "/proc/4324/root/opt/csw/gcc4/libexec/gcc/i386-pc-solaris2.10/4.5.1/cc1plus"
 File: "/proc/4324/root/opt/csw/gcc4/lib/libsupc++.la"

Crash dump was written to: erl_crash.dump
eheap_alloc: Cannot allocate 153052320 bytes of memory (of type "heap").
Abort - core dumped
bash-3.00#

Question 1. With such a powerful server, why would the operating system fail to provide such memory to the application (it was the only application running)?

Question 2. The Erlang Emulator i start is instructed to be able to spawn as many processes as it may need. the value +P 13421779 . Is Erlang VM failing to access this memory or failing to allocate it to its processes ?

Question 3. To Solaris, it sees one process: epmd , perhaps containing and starting thousands of micro threads. What configurations can i make to Solaris to be able to never stop my application however much "memory hungry" it may be? Swap space available is 16 GB, RAM 20 GB, honestly, there must be something wrong.

Question 4. Which configurations can i make to the Erlang Emulator, to avoid these heap memory crash dumps especially when all the memory it may need is available on the server? How will i run more memory consuming apps on this server if Erlang still fails to allocate such memory to a simple file system indexer (well its heavily concurrent)?

finally, all other tweaks i could do to avoid heap memory problems on such capable hardware are welcome. Thanks in advance


I haven't had time to look at the source, but here are some comments:

Question 1. With such a powerful server, why would the operating system fail to provide such memory to the application (it was the only application running)?

Because the Erlang VM tried to consume more than the available free memory.

Question 2. The Erlang Emulator i start is instructed to be able to spawn as many processes as it may need. the value +P 13421779. Is Erlang VM failing to access this memory or failing to allocate it to its processes ?

No. If you would have run out of processess, the Erlang VM would have said so (and the VM would still be up and running):

=ERROR REPORT==== 18-Aug-2011::10:04:04 ===
Error in process <0.31775.138> with exit value: {system_limit,[{erlang,spawn_link,    [erlang,apply,[#Fun<shell.3.130303173>,[]]]},{erlang,spawn_link,1},{shell,get_command,5},    {shell,server_loop,7}]}

Question 3. To Solaris, it sees one process: epmd, perhaps containing and starting thousands of micro threads. What configurations can i make to Solaris to be able to never stop my application however much "memory hungry" it may be? Swap space available is 16 GB, RAM 20 GB, honestly, there must be something wrong.

epmd is the Erlang port mapping deamon. It's responsible for managing distributet Erlang and has nothing to with your individual Erlang application. The processes you should look for will be name beam.smp most likely. These will show the OS memory consumption of the Erlang VM etc.

Question 4. Which configurations can i make to the Erlang Emulator, to avoid these heap memory crash dumps especially when all the memory it may need is available on the server? How will i run more memory consuming apps on this server if Erlang still fails to allocate such memory to a simple file system indexer (well its heavily concurrent)?

The Erlang VM should be able to use all of the available memory in your machine. However, it depends on how your application is written. There can be many reasons for memory leaks:

  • Atom table filling up (you create too many unique atoms)
  • ETS or Mnesia tables are not garbage collected (you do not delete old unused elements)
  • Not enough memory for processes (you spawn too many processess)
  • Too many binaries are created (you might keep unused references to old binaries)
  • 链接地址: http://www.djcxy.com/p/66282.html

    上一篇: 评估Erlang递归函数(为什么这个工作)

    下一篇: Erlang及其对堆内存的消耗