boost::process Porting Notes

Keywords: C++ boost


Recently I wanted to write a process manager. I was learning boost at the time, so I chose boost::process, but I soon found that it does not compile under VC2010.
The reason is that process uses many C++11 features, and VC2010's support for them is incomplete.
So here are my porting notes.
Don't tell me that VC2019 supports all of this perfectly. I simply find VC2010 lightweight and clean: it doesn't start a pile of background processes and waste resources, haha.
In fact, only by reinventing the wheel can we truly learn the design ideas behind other people's code. I suppose that is the real value of tinkering.
The ported code is on GitHub; the address is at the end.

process porting notes

Metafunction

Metafunctions are an interesting concept I came across in the boost::mpl documentation. process uses them heavily, so they need to be introduced first to pave the way for what follows.

Before looking at metafunctions, it helps to recall what a regular C++ function looks like; comparing the two makes the concept easier to grasp.

Basic elements of a regular function

int MyFunc(int fld1)
{
    //Do something meaningful that suits your needs

    return 111;
}

The code above defines a function named MyFunc, which has three important parts:

  • Parameters (input)
  • Function body
  • Return value

MyFunc takes the input fld1 and, after some processing, returns the desired result, 111.

Examples of metafunction usage scenarios

To get a first impression of metafunctions, let's first look at a real-world scenario.

Imagine having a communication module responsible for sending local data to remote hosts. In this case, we should consider the following issues:

  • For integer types, byte order differs between machines: some are little-endian, some are big-endian. Big-endian is used uniformly on the wire.
  • For strings, we want to add a length indicator so that the receiver can determine the actual length of the string
  • Floating-point numbers are represented differently on different machines, so they are converted to integers before being sent

With object-oriented design, we naturally think of encapsulating the different types in different classes. For example, CInt handles byte-order conversion for integers, CStr handles strings, and CFloat converts floating-point numbers to 64-bit integers.

Next, we need to define a Send function that is responsible for sending data:

template<typename T>
int Send(const T &t)
{
    MyFunc<T> s(t); //Note: this line is illegal. C++ has no such syntax, so compilation fails

    //Send the converted s to the remote host
}

The goal of the above code is to:

  1. The input T is the actual data type: int, char *, float, and so on.
  2. Send first converts t to the corresponding instance of CInt, CStr, or CFloat, and then sends the converted data.

The key point here is the type declaration for s. If there were a MyFunc-like "function" that converted the input type T to the CInt, CStr, and so on that we want, the code above would compile.

Metafunction Definition

C++ has no dedicated syntax for declaring metafunctions. In mpl they are implemented by borrowing class templates.

template<typename fld1> //Metafunction Input
struct MyFunc
{
    typedef ...... var1; //You can use typedef to define some variables
    //std and mpl provide some standard templates for similar conditional judgments, loops, and so on

    typedef ...... type; //This type is the result of a metafunction, which is the Convention in mpl
};

The part of the class template MyFunc inside '{}' is equivalent to the function body of the metafunction, the template parameter is the metafunction's parameter, and type is its return value.

And the Send function we wrote earlier looks like this:

template<typename T>
int Send(const T &t)
{
    typename MyFunc<T>::type s(t); //Here's the difference (typename is required because type depends on T)

    //Send the converted s to the remote host
}

In real code, a metafunction is rarely implemented as a single template definition. It usually relies on template specialization and is spread over several templates. The code is often scattered across multiple files, so a metafunction is more of a conceptual abstraction.
As shown in the following example:

//General Statement
template<typename fld1>
struct MyFunc
{
};
//Specialized versions for each type
template<>
struct MyFunc<int>
{
    typedef CInt type;
};
template<>
struct MyFunc<char *>
{
    typedef CStr type;
};
template<>
struct MyFunc<float>
{
    typedef CFloat type;
};
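
Putting the pieces together, the following is a minimal compilable sketch of the Send idea. The bodies of CInt and CStr, the Bytes typedef, and the choice to return the encoded bytes instead of really sending them are assumptions made for illustration; CFloat is left out for brevity, and const char * is used instead of char * so that a string literal can be passed through a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Hypothetical wrapper: serializes an integer as 4 bytes, big-endian.
struct CInt {
    explicit CInt(int v) : value(static_cast<unsigned long>(v) & 0xFFFFFFFFul) {}
    Bytes bytes() const {
        Bytes out;
        for (int shift = 24; shift >= 0; shift -= 8)
            out.push_back(static_cast<unsigned char>((value >> shift) & 0xFF));
        return out;
    }
    unsigned long value;
};

// Hypothetical wrapper: prefixes the string with a one-byte length
// (the sketch assumes strings shorter than 256 bytes).
struct CStr {
    explicit CStr(const char* s) : text(s) {}
    Bytes bytes() const {
        Bytes out;
        out.push_back(static_cast<unsigned char>(text.size()));
        out.insert(out.end(), text.begin(), text.end());
        return out;
    }
    std::string text;
};

// Primary template: no nested type, so unsupported types fail to compile.
template <typename T> struct MyFunc {};
template <> struct MyFunc<int>         { typedef CInt type; };
template <> struct MyFunc<const char*> { typedef CStr type; };

// "Calls" the metafunction to pick the wrapper, then returns the wire bytes.
template <typename T>
Bytes Send(const T& t) {
    typename MyFunc<T>::type s(t);
    return s.bytes();
}
```

With these definitions, Send(258) yields the four bytes 0, 0, 1, 2, and passing a const char* pointing at "hi" yields the length byte 2 followed by 'h' and 'i'.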

child class

child is one of the most central and interesting classes in this port.
When you define a function in C++, the number of parameters is fixed, and arguments must be passed in exactly the order of the definition.
child, however, is unconventional and does not follow this basic rule: it can be constructed with arbitrary arguments.
I had only seen this in scripting languages.

process achieves this with the following special constructor:

// <boost/process/detail/child_decl.hpp>
class child
{
public:
    template<typename ...Args>
    explicit child(Args&&...args);
};

// <boost/process/child.hpp>
template<typename ...Args>
child::child(Args&&...args)
    : child(::boost::process::detail::execute_impl(std::forward<Args>(args)...)) {}

The core logic is in the function execute_impl, located in '<boost/process/detail/execute_impl.hpp>'. It simply performs a string conversion: if any input argument contains a Unicode string, all ANSI strings are converted to Unicode strings. It then forwards the converted arguments to basic_execute_impl, which holds the core logic of child construction.
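
To show only the language mechanism behind "arbitrary arguments in any order", here is a minimal sketch of a variadic forwarding constructor. The names launcher, exe_path, and work_dir are hypothetical, and dispatching through overloaded apply functions is a simplification chosen for this sketch; process itself routes arguments through tags and builders, as described below.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical option types standing in for the arguments child accepts.
struct exe_path { std::string value; };
struct work_dir { std::string value; };

// Hypothetical stand-in for child.
class launcher {
public:
    template <typename... Args>
    explicit launcher(Args&&... args) {
        // Expand the pack: apply() runs once per argument, left to right.
        int expand[] = {0, (apply(std::forward<Args>(args)), 0)...};
        (void)expand;
    }

    std::string exe;
    std::string dir;

private:
    // Overload resolution picks the handler from the argument's type,
    // so the position of an argument no longer matters.
    void apply(const exe_path& p) { exe = p.value; }
    void apply(const work_dir& d) { dir = d.value; }
};
```

Constructing launcher(exe_path{"cmd"}, work_dir{"tmp"}) and launcher(work_dir{"tmp"}, exe_path{"cmd"}) leaves both objects in the same state.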

child construction process

On Windows, the API for creating a process is CreateProcess.
The function is quite complex: it has 10 parameters, one of which, lpStartupInfo, is a structure with 18 member variables, so 27 elements need attention in total.

process groups these 27 elements by function and defines one class to handle each group (all of these classes inherit from handler_base).
For example, the image file path and command-line arguments are handled by class exe_cmd_init, the working directory by class start_dir_init, and asynchronous pipes by async_pipe_in, async_pipe_out, and so on.

basic_execute_impl first divides the arguments into two categories:

  1. Raw types such as strings
  2. Types that inherit from handler_base, such as async_pipe_in

Everything that follows is really about converting the first category into handler_base subclasses, recombining them with the second category into a new sequence, and handing that to the executor class to create the process.
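
The split into two categories can be expressed as a small compile-time test. The sketch below uses empty stand-in types and a hypothetical trait name, derives_from_handler; it is not necessarily how process spells the check.

```cpp
#include <string>
#include <type_traits>

struct handler_base {};                   // stand-in for the real base class
struct async_pipe_in : handler_base {};   // stand-in for a ready-made handler

// Hypothetical trait: true when T, ignoring references and cv-qualifiers,
// derives from handler_base (category 2); false for raw types (category 1).
template <typename T>
struct derives_from_handler
    : std::is_base_of<handler_base,
                      typename std::remove_cv<
                          typename std::remove_reference<T>::type>::type> {};

static_assert(derives_from_handler<async_pipe_in&>::value,
              "handlers can be passed through unchanged");
static_assert(!derives_from_handler<const char*>::value,
              "raw strings still need a builder");
static_assert(!derives_from_handler<std::string>::value,
              "raw strings still need a builder");
```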

make_builders_from_view

make_builders_from_view is a metafunction that converts the first category into a set of builder classes.
The conversion takes two steps. First, the initializer_tag metafunction finds the tag class that identifies the argument.
Then, given that tag class, the initializer_builder metafunction finds the corresponding builder class.
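
The two-step lookup can be sketched with plain template specialization. The specializations listed here, the empty exe_builder, and the helper builder_for are assumptions made for illustration; the real library covers many more argument types.

```cpp
#include <cstddef>
#include <type_traits>

// Step 1: map a raw argument type to a tag.
template <typename Char> struct cmd_or_exe_tag {};
template <typename T> struct initializer_tag;   // undefined for unsupported types
template <> struct initializer_tag<const char*>    { typedef cmd_or_exe_tag<char> type; };
template <> struct initializer_tag<const wchar_t*> { typedef cmd_or_exe_tag<wchar_t> type; };
template <std::size_t N>
struct initializer_tag<const char[N]>              { typedef cmd_or_exe_tag<char> type; };

// Step 2: map the tag to a builder (an empty stand-in here).
template <typename Char> struct exe_builder {};
template <typename Tag> struct initializer_builder;
template <typename Char>
struct initializer_builder<cmd_or_exe_tag<Char> >  { typedef exe_builder<Char> type; };

// Hypothetical helper that chains both steps.
template <typename T>
struct builder_for {
    typedef typename initializer_builder<
        typename initializer_tag<T>::type>::type type;
};

static_assert(std::is_same<builder_for<const char*>::type,
                           exe_builder<char> >::value,
              "a narrow string maps to the narrow builder");
```

Because two different argument types (const char* and const char[N]) map to the same tag, they also end up with the same builder, which is why duplicates have to be merged in the next step.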

append_set

Several arguments may map to the same builder class, so make_builders_from_view needs to merge duplicates while processing.
I found no similar functionality in boost::fusion, so, using as_vector as a reference, I wrote this append_set metafunction.
It takes two parameters: a set of types and a new type.
If the new type is not already in the set, append_set adds it and yields a new set.

The key technique is the use of boost::fusion::result_of::size and iterators.
size gives the actual number of types in the original set, the types are expanded through iterators, and then, together with the new type, the as_set metafunction constructs a new set.
The type expansion relies on template specialization, which is a bit cumbersome to write.
The implementation could be much simpler if the compiler supported variadic templates.
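
For comparison, this is roughly what the same idea looks like on a compiler that does support variadic templates. It is a sketch only: type_set is a hypothetical stand-in for a fusion set, and contains is a helper invented here.

```cpp
#include <type_traits>

// Hypothetical stand-in for a fusion set of types.
template <typename... Ts> struct type_set {};

// contains<T, Ts...>: true when T is one of Ts.
template <typename T, typename... Ts>
struct contains : std::false_type {};
template <typename T, typename Head, typename... Tail>
struct contains<T, Head, Tail...>
    : std::conditional<std::is_same<T, Head>::value,
                       std::true_type,
                       contains<T, Tail...> >::type {};

// append_set<Set, T>: Set unchanged if it already holds T, else Set plus T.
template <typename Set, typename T> struct append_set;
template <typename... Ts, typename T>
struct append_set<type_set<Ts...>, T> {
    typedef typename std::conditional<contains<T, Ts...>::value,
                                      type_set<Ts...>,
                                      type_set<Ts..., T> >::type type;
};

static_assert(std::is_same<append_set<type_set<int>, char>::type,
                           type_set<int, char> >::value,
              "a new type is appended");
static_assert(std::is_same<append_set<type_set<int, char>, int>::type,
                           type_set<int, char> >::value,
              "a duplicate leaves the set unchanged");
```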

Once append_set has produced the merged collection of builder classes, the next step is to pass the argument values to these builders.

get_initializers functor

With the collection of builder classes in hand, the for_each algorithm in boost::fusion (comparable to the STL function of the same name) is used to create our final result from the builders: handler_base subclasses.
The functor used by the algorithm is get_initializers. Besides overloading operator(), it must also define result, the return type of operator().

For the result metafunction, I originally wrote the following:

struct get_initializers
{
    template<typename Element> //I assumed Element here was the builder type obtained earlier; that turned out to be wrong
    struct result
    {
        typedef typename get_initializers_result<Element>::type type;
    };

    template<typename Element>
    typename result<Element>::type
    operator ()(Element &e) const
    {
        return e.get_initializer();
    }
};

The code above does not compile, so the problem cannot be tracked down by stepping through it in a debugger.
To troubleshoot boost::fusion problems, you must learn to read the compiler's error output, as in the following example:

D:\boost4ets\boost_1_64_0\boost/utility/detail/result_of_iterate.hpp(160): see reference to class template instantiation 'boost::tr1_result_of<F>' being compiled
with
[
    F=ets::process::detail::get_initializers (ets::process::detail::error_builder)
]
D:\boost4ets\boost_1_64_0\boost/fusion/view/transform_view/detail/apply_transform_result.hpp(31): see reference to class template instantiation 'boost::result_of<F>' being compiled
with
[
    F=ets::process::detail::get_initializers (ets::process::detail::error_builder)
]

This is actual VC2010 output. The higher an entry appears in the output, the deeper it sits in the call stack (viewing the stack as growing from top to bottom as parameters are pushed).
Each entry consists of three parts:

  1. D:\boost4ets\boost_1_64_0\boost/fusion/view/transform_view/detail/apply_transform_result.hpp(31): the file name and line number where the error occurs
  2. boost::result_of<F>: the type definition in which the problem occurs
  3. F=ets::process::detail::get_initializers (ets::process::detail::error_builder): the template argument corresponding to part 2

In fact, the error message above already tells us the cause. The problem lies in apply_transform_result, so let's first look at its definition:

// <boost/fusion/view/transform_view/detail/apply_transform_result.hpp>
template <typename F>
struct apply_transform_result
{
    template <typename T0>
    struct apply<T0, void_>
        : boost::result_of<F(T0)> //The key point is here
    {};
};

The key point is the template argument of result_of: it is actually a "function type" whose parameter type is T0 and whose return type is F.
Ultimately, this function type is passed to get_initializers::result as its Element parameter.
In the implementation of result, what we actually want is the parameter type T0 inside the function type, not the function type itself.
Therefore result has to be specialized so that the function type is taken apart:

struct get_initializers
{
    template<typename F>
    struct result;

    template <typename Element>
    struct result<get_initializers(Element)> //The partial specialization takes the function type apart; Element is the builder type we really want
    {
        typedef typename get_initializers_result<Element>::type type;
    };
};
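
The protocol can be reproduced without boost. The sketch below hand-rolls a tiny my_result_of that, like the lookup described above, passes the whole function type F(Arg) to F::result. All names in it (my_result_of, get_initializers_sketch, exe_builder, exe_init, initializer_type) are hypothetical and exist only for this illustration.

```cpp
#include <type_traits>

struct exe_init {};                       // stand-in for a handler_base subclass
struct exe_builder {                      // stand-in for a builder
    typedef exe_init initializer_type;
    exe_init get_initializer() const { return exe_init(); }
};

struct get_initializers_sketch {
    template <typename Sig> struct result;   // primary template: declared only

    // Take the function type Self(Element) apart to reach the argument type.
    template <typename Self, typename Element>
    struct result<Self(Element)> {
        typedef typename std::remove_cv<
            typename std::remove_reference<Element>::type>::type builder_type;
        typedef typename builder_type::initializer_type type;
    };

    template <typename Element>
    typename result<get_initializers_sketch(Element&)>::type
    operator()(Element& e) const {
        return e.get_initializer();
    }
};

// Minimal imitation of the lookup: hand F(Arg) as a whole to F::result.
template <typename Sig> struct my_result_of;
template <typename F, typename Arg>
struct my_result_of<F(Arg)> {
    typedef typename F::template result<F(Arg)>::type type;
};

static_assert(std::is_same<
                  my_result_of<get_initializers_sketch(exe_builder&)>::type,
                  exe_init>::value,
              "the argument type, not the function type, selects the result");
```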

Although the compiler's error output provides a lot of information for troubleshooting, in actual development I have run into cases where the output was incomplete (or even absent). If you have a better troubleshooting method, I would welcome your advice.
In addition, boost provides the function template type_id, which can retrieve type information at run time, but only if the program compiles in the first place, so it feels of limited use when developing with mpl.

executor class

The last step of process creation is to take the array of handler_base subclasses obtained above and call their interface functions one by one to fill in the 27 elements of CreateProcess. The steps are as follows:

  1. Call on_setup_t to move the parameters from each handler_base subclass into the executor
  2. Call CreateProcess to create the process
  3. Notify each handler_base subclass of the result of process creation (see the code that invokes on_success_t and on_error_t)
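
The three steps can be sketched at run time as follows. This is a deliberate simplification: the real executor walks a fusion sequence at compile time, whereas the sketch uses virtual functions and a vector. The names executor_sketch, handler, cmd_handler, and fake_launch are hypothetical, and fake_launch stands in for CreateProcess so that the sketch has no platform dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct executor_sketch;

// Hypothetical runtime stand-in for handler_base.
struct handler {
    virtual ~handler() {}
    virtual void on_setup(executor_sketch&) {}
    virtual void on_success(executor_sketch&) {}
    virtual void on_error(executor_sketch&, int) {}
};

struct executor_sketch {
    std::string cmd_line;          // stands in for one CreateProcess element
    std::vector<std::string> log;  // records which notifications fired

    // launch stands in for CreateProcess; it returns 0 on success.
    int run(const std::vector<handler*>& handlers,
            int (*launch)(const executor_sketch&)) {
        for (std::size_t i = 0; i < handlers.size(); ++i)
            handlers[i]->on_setup(*this);                    // step 1
        const int error = launch(*this);                     // step 2
        for (std::size_t i = 0; i < handlers.size(); ++i) {  // step 3
            if (error == 0) handlers[i]->on_success(*this);
            else            handlers[i]->on_error(*this, error);
        }
        return error;
    }
};

// Handler that supplies the command line and logs the outcome.
struct cmd_handler : handler {
    explicit cmd_handler(const std::string& c) : cmd(c) {}
    void on_setup(executor_sketch& e)      { e.cmd_line = cmd; }
    void on_success(executor_sketch& e)    { e.log.push_back("success"); }
    void on_error(executor_sketch& e, int) { e.log.push_back("error"); }
    std::string cmd;
};

// Fake launcher: succeeds only if some handler supplied a command line.
int fake_launch(const executor_sketch& e) { return e.cmd_line.empty() ? 1 : 0; }
```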

A pitfall caused by boost::fusion::transform_view

transform_view can logically be viewed as a new sequence container, but the values in the container are created only when they are actually used and are released immediately after use (equivalent to using temporary variables).
In executor, variables such as cmd_line and work_dir are defined as pointer types. While the elements of the transform_view are traversed, these pointers end up holding the addresses of temporaries, which become invalid once the traversal ends, causing problems in the subsequent process creation.
In this port, the transform_view is wrapped in an extra as_vector call that caches the results of the transformation, which avoids the problem.
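
The pitfall and the fix can be illustrated without fusion. In the sketch below, lazy_upper_view is a hypothetical view that, like transform_view, builds a fresh temporary on every access, and materialize plays the role that the extra as_vector call plays in the port.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical lazy view: every access builds a fresh temporary.
struct lazy_upper_view {
    const std::vector<std::string>* source;

    std::size_t size() const { return source->size(); }

    std::string at(std::size_t i) const {   // returned by value: a temporary
        std::string s = (*source)[i];
        for (std::size_t k = 0; k < s.size(); ++k)
            if (s[k] >= 'a' && s[k] <= 'z')
                s[k] = static_cast<char>(s[k] - 'a' + 'A');
        return s;
    }
};

// Evaluate the view once and keep the results alive in a real container.
std::vector<std::string> materialize(const lazy_upper_view& v) {
    std::vector<std::string> cache;
    for (std::size_t i = 0; i < v.size(); ++i)
        cache.push_back(v.at(i));
    return cache;
}

// Dangerous pattern (do not do this): the temporary returned by at() dies
// at the end of the statement, so the pointer dangles immediately.
//     const char* cmd_line = view.at(0).c_str();
```

A pointer taken from an element of the materialized vector stays valid for as long as that vector is alive and unmodified, which is what the pointer members in executor need.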

Sample Code

#define BOOST_ASIO_HAS_MOVE 1 //boost::asio seems to misdetect rvalue reference support; VC2010 appears to support it, so enable it manually here

#include <iostream>
#include <string>
#include <vector>

#include <boost/asio.hpp>
#include <process/async_pipe.hpp>
#include <process/child.hpp>
#include <process/search_path.hpp>
#include <process/io.hpp>

int main(int argc, char* argv[])
{
    boost::asio::io_service ios;
    std::vector<char> buf(100);

    ets::process::async_pipe ap(ios);

    ets::process::child c(ets::process::search_path("cmd"), "/?", ets::process::std_out > ap);

    boost::asio::async_read(ap, boost::asio::buffer(buf),
        [&buf](const boost::system::error_code &ec, std::size_t size){
            std::cout << std::string(buf.begin(), buf.begin() + size);
        });

    ios.run();
    c.wait();
    int result = c.exit_code();

    return result;
}

The sample is based on the asynchronous I/O example from the official process tutorial, with some adjustments for VC2010:

  1. Define the BOOST_ASIO_HAS_MOVE macro before including any boost header, because boost misjudges VC2010's rvalue reference support and this has to be corrected manually
  2. boost4ets puts all the ported code in the ets namespace, but the asynchronous I/O functionality is still in the boost namespace, so take care to distinguish the two when using them
  3. g++ is normally not available on Windows, so I changed the launched program to cmd
  4. The official sample neither sets the buffer size nor prints the result, so I added logic to print the output (only the first 100 bytes)

Compile

Because process depends on libraries such as filesystem that have to be built, boost must be compiled first. See the official documentation for the procedure; it is not repeated here.
I recommend boost 1.64, the version on which I did this port. Other versions are not guaranteed to be compatible.
Official download address

Compile-time directory structure:

boost4ets
|__boost_1_64_0
|  |__boost # boost source
|  |__libs
|  |__tools
|__fusion
|__out
|  |__lib # lib library file path generated after boost compilation
|__process # process source for boost4ets customized version
|__test
   |__asyn_io.cpp # Sample source for this test

  • boost_1_64_0 is the directory extracted from the downloaded boost_1_64_0.7z
  • out is the output path for the library files produced when compiling boost
  • fusion, process, and test correspond to the code repository

Open the VC2010 command-line build environment, change the current directory to boost4ets\test, and enter the following command:

cl /I"..\boost_1_64_0" /I".." /D "WIN32" /D "NDEBUG" /MD /EHsc asyn_io.cpp /link /LIBPATH:"..\out\lib" "shell32.lib"

If compilation succeeds, the file asyn_io.exe is generated in the current directory. Running it produces the following output:

D:\boost4ets\test>asyn_io.exe
Starts a new instance of the Windows command interpreter

CMD [/A | /U] [/Q] [/D] [/E:ON | /E:OFF] [/F:ON | /F:OFF] [/V

The following are some personal questions. If I have anything wrong, corrections are welcome.

boost::fusion algorithm

I don't understand why make_builders_from_view uses algorithms such as filter_if directly (this is what the official boost::process does).
Many of the algorithms in fusion are essentially just thin wrappers around a view, but they add a side effect: they put a const constraint on the container.
Here is a code snippet from filter_if:

// <boost/fusion/algorithm/transformation/filter_if.hpp>
template <typename Pred, typename Sequence>
BOOST_CONSTEXPR BOOST_FUSION_GPU_ENABLED
inline typename result_of::filter_if<Sequence const, Pred>::type
filter_if(Sequence const& seq) //const qualifier added here
{
    return filter_view<Sequence const, Pred>(seq);
}

Later, in the executor class, on_setup in many handler_base subclasses is not declared const, so by rights this ought to cause a compilation error.

Strange design of pipes

When an async_pipe is constructed, two pipe ends are created, _source and _sink, the former for reading and the latter for writing, so I naturally wrote the following code:

boost::asio::io_service ios;
ets::process::async_pipe ap(ios);

ets::process::child c("cmd.exe", ets::process::std_in < ap, ets::process::std_out > ap); //Redirect both standard input and standard output to ap

My intention was that, once started, the process would let me read its output and would wait for me to write to its input stream. In fact, after starting, the process produced no output and exited immediately.

Reading the code, I found something odd in async_pipe_in (the handler_base subclass for async_pipe; async_pipe_out has similar logic): its on_success event closes the handle of the output stream. Presumably this is because async_pipe_in only deals with input, so the output stream is considered meaningless and is closed.

But if only one end of the pipe is ever used, why does async_pipe create two handles at the same time?

Discussion of the char_converter_t implementation

This class performs character type conversion and is used in execute_impl. When I first ported the code I deleted this class; as long as character sets are not mixed, that causes no problem.
But while writing this article, the sample code used search_path together with a single-byte string, which failed to compile, so I added the feature back.

The code is implemented as "transform_view + the functor call_char_converter", but one problem came up during implementation: how should result be defined?
char_converter (char_converter_t is essentially just an alias for char_converter) defines only a single function, conv. My first idea was to use decltype to deduce the type from the function's return value, with the following code:

template<typename Char>
struct call_char_converter
{
    template<typename F>
    struct result;

    template <typename Element>
    struct result<call_char_converter(Element)>
    {
        typedef typename std::remove_cv<typename std::remove_reference<Element>::type>::type res_type;
        typedef decltype(ets::process::detail::char_converter<Char, res_type>::conv(res_type())) type;
    };

    template<typename Element>
    typename result<call_char_converter(Element &)>::type
        operator ()(Element &e) const
    {
        return ets::process::detail::char_converter<Char, Element>::conv(e);
    }
};

Unfortunately this fails to compile, and the culprit is the expression conv(res_type()): for example, the type may have no default constructor, an array type cannot be constructed this way, a reference type cannot be constructed, and so on.
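
For reference, the usual way around this on a C++11-conforming compiler is std::declval, which produces an expression of the desired type inside decltype without constructing anything. The sketch below assumes std::declval is available (whether VC2010 provides it was not checked for these notes), and converter_sketch, conv_result, and no_default are hypothetical names used only here.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for char_converter.
template <typename Char, typename T> struct converter_sketch;

template <> struct converter_sketch<wchar_t, const char*> {
    static std::wstring conv(const char* in) {
        return std::wstring(in, in + std::char_traits<char>::length(in));
    }
};

// A type without a default constructor, the case that broke res_type().
struct no_default { explicit no_default(int) {} };

template <> struct converter_sketch<wchar_t, no_default> {
    static no_default& conv(no_default& in) { return in; }
};

// Deduce the return type of conv from an unevaluated call. declval yields
// an lvalue of res_type here; no object is ever constructed.
template <typename Char, typename Element>
struct conv_result {
    typedef typename std::remove_cv<
        typename std::remove_reference<Element>::type>::type res_type;
    typedef decltype(converter_sketch<Char, res_type>::conv(
        std::declval<res_type&>())) type;
};

static_assert(std::is_same<conv_result<wchar_t, const char*&>::type,
                           std::wstring>::value,
              "narrow strings convert to std::wstring");
static_assert(std::is_same<conv_result<wchar_t, no_default&>::type,
                           no_default&>::value,
              "no default constructor is needed");
```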

Since decltype did not seem to work, I took a clumsy approach: for each char_converter I define a type conversion metafunction (in fact this follows the way transform_view expects a functor to be defined).

The new code is as follows:

    template <typename Element>
    struct result<call_char_converter(Element &)>
    {
        typedef typename ets::process::detail::char_converter<Char, Element> res_char_converter;
        typedef typename boost::detail::tr1_result_of_impl<
            res_char_converter,
            Element,
            boost::detail::has_result_type<res_char_converter>::value
        >::type type; //Meaning: if res_char_converter defines result_type, use it as the return type; otherwise use the result metafunction in res_char_converter
    };

Here is the implementation of char_converter:

template<>
struct char_converter<wchar_t, const char*>
{
    typedef std::wstring result_type; //Add this line

    static std::wstring conv(const char* in)
    {
        std::size_t size = 0;
        while (in[size] != '\0') size++;
        return ::ets::process::detail::convert(in, in + size);
    }
};

A closer look at char_converter

Some thoughts on the generic version of char_converter:

template<typename Char, typename T>
struct char_converter
{
    static T&  conv(T & in)
    {
        return in;
    }
    static T&& conv(T&& in)
    {
        return std::move(in);
    }
    static const T&  conv(const T & in)
    {
        return in;
    }
};

I think there are two problems with this implementation:

  1. When the argument is passed by reference, const becomes part of T, so versions 1 and 3 are somewhat redundant
  2. For version 2, if the argument passed in is an lvalue, T is a reference type and T&& collapses to T&, which makes it a redefinition of version 1

Therefore, in the ported code, I simply deleted versions 2 and 3 and kept only version 1. So far I have not run into any problems.
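
The collapsing rules behind both points can be checked directly. In this sketch, signatures is a hypothetical helper that merely names the three parameter types of the generic char_converter for a given T.

```cpp
#include <type_traits>

// Hypothetical helper: the parameter types of the three conv overloads.
template <typename T>
struct signatures {
    typedef T&       version1;
    typedef T&&      version2;
    typedef const T& version3;
};

// When T is an lvalue reference, all three collapse to the same type,
// so the three overloads would be redefinitions of one another.
static_assert(std::is_same<signatures<int&>::version1, int&>::value,
              "T& with T = int& is int&");
static_assert(std::is_same<signatures<int&>::version2, int&>::value,
              "T&& with T = int& collapses to int&");
static_assert(std::is_same<signatures<int&>::version3, int&>::value,
              "const applied to a reference type is ignored");

// When T is not a reference, the three stay distinct.
static_assert(!std::is_same<signatures<int>::version1,
                            signatures<int>::version2>::value,
              "int& and int&& differ");
static_assert(!std::is_same<signatures<int>::version1,
                            signatures<int>::version3>::value,
              "int& and const int& differ");
```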

For type deduction of function template parameters, I referred to an article on the Internet titled "Type Derivation of C++".

Why there is no specialization for basic_native_environment

Two classes are provided for handling environment variables: basic_native_environment and basic_environment.

char_converter has a specialization for basic_environment, but none for basic_native_environment.
At first I misunderstood the relationship between these two classes, so I could not see why process provides only the one specialization.
It only became clear after reading the implementation:

  • basic_native_environment manipulates the environment variables of the current process. It is essentially a wrapper around system APIs such as GetEnvironmentStrings and SetEnvironmentVariable; once an environment variable is modified through it, the change really affects the execution of the current process
  • basic_environment can be thought of as an array of key-value pairs; modifying it does not affect the current process. When creating a child process, you should use this class to specify new environment variables for the child

In practice, you can first use basic_native_environment to obtain the current process's environment variables, import them into a basic_environment, modify some of the values as needed, and then create the new process with the modified basic_environment.

initializer_tag

char sExec[] = "cmd";
ets::process::child c(sExec);

The code above fails to compile with error C2027: use of undefined type "ets::process::detail::initializer_tag<T>".
The cause is that sExec is of type char (&)[4], while initializer_tag is only specialized for const character arrays:

template<std::size_t Size> struct initializer_tag<const char    [Size]> { typedef cmd_or_exe_tag<char>     type;};

Judging from this, version 1.64 of the process library still has room for improvement.
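
One possible remedy is to add a specialization for non-const character arrays. The sketch below is a suggestion only: the added specialization is not part of process 1.64 or of the port, and cmd_or_exe_tag is reduced to an empty stand-in.

```cpp
#include <cstddef>
#include <type_traits>

template <typename Char> struct cmd_or_exe_tag {};   // empty stand-in

template <typename T> struct initializer_tag;        // undefined by default

// The existing specialization quoted above: const arrays only.
template <std::size_t Size>
struct initializer_tag<const char[Size]> { typedef cmd_or_exe_tag<char> type; };

// Hypothetical addition: the same mapping for non-const arrays.
template <std::size_t Size>
struct initializer_tag<char[Size]> { typedef cmd_or_exe_tag<char> type; };

char sExec[] = "cmd";

// decltype((sExec)) is char (&)[4]; strip the reference to get char[4].
typedef std::remove_reference<decltype((sExec))>::type array_type;

static_assert(std::is_same<array_type, char[4]>::value,
              "sExec is a non-const array of 4 chars");
static_assert(std::is_same<initializer_tag<array_type>::type,
                           cmd_or_exe_tag<char> >::value,
              "the added specialization now matches");
```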

Related Links

Posted by woocha on Tue, 09 Nov 2021 08:12:48 -0800