
A MAGICIAN is a person who can associate Materials Genome, Materials Informatics, Chemo-Informatics and Networks.

MAGICIAN Training Course > Lecture materials > Formulation top page > Design of paint formulations


MAGICIAN(MAterials Genome/Informatics and Chemo-Informatics Associate Network)Training Course


MIRAI (Multiple Index Regression for AI)

How to use the analysis tool MIRAI.


The basic specifications of MIRAI have already been explained.

This section introduces an example of using MIRAI to design paint formulations.

Example of data to be analyzed


Consider a powder coating of a mixture of fluoropolymers and hydrocarbon polymers. Fluoropolymers provide coatings with excellent weather resistance, but they are expensive.

If the coating looks like A, there is a risk of peeling at the boundary between the fluoropolymer (green) and the hydrocarbon polymer (blue).

In a coating like C, a large amount of expensive fluoropolymer is used, yet the hydrocarbon polymer exposed at parts of the surface degrades the performance.

In a coating like B, the fluoropolymer is concentrated at the coating surface, where it provides weather resistance, while the composition changes gradually across the polymer boundary, so delamination is unlikely to occur.

Dainippon Paint's patent (JP WO2013/186832) realizes this kind of powder coating.

The patent uses the following ingredients.

Polyester resin B and acrylic resin B are listed in the table, but no experiment uses them.

The evaluations of the resulting coatings are as follows. Five of these properties are analyzed in this section.

Let's copy all the following data and paste it into a spreadsheet.

The meaning of the data is as follows.

Type in the data for the patent.

Once that is done, the first step is to separate the data into training data and prediction data. To do that, I generate a random number in the last column of the table and sort the rows by it. The last few rows are then used as the prediction data. The objective variable of the prediction data should be set to -0.9876.
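The random split described above can be sketched in Python as follows. This is only an illustration of the spreadsheet procedure, not part of MIRAI itself; the function name, the row layout (name in the first column, objective variable in the second) and the sample rows are assumptions made for the example.

```python
import random

SENTINEL = -0.9876  # objective value the text prescribes for prediction rows


def split_for_mirai(rows, n_predict, objective_index=1, seed=None):
    """Sort rows by a random key and mark the last n_predict rows as prediction data.

    Returns (ordered_rows, held_out), where held_out maps the row name
    (first column) to the true objective value that was replaced.
    """
    if not 0 < n_predict < len(rows):
        raise ValueError("n_predict must be between 1 and len(rows) - 1")
    rng = random.Random(seed)
    # Equivalent to adding a random-number column and sorting by it.
    ordered = [list(row) for row in sorted(rows, key=lambda _row: rng.random())]
    held_out = {}
    for row in ordered[-n_predict:]:
        held_out[row[0]] = row[objective_index]
        row[objective_index] = SENTINEL
    return ordered, held_out


# Hypothetical rows: name, objective variable, then two ingredient amounts.
rows = [["Ex%d" % i, float(i), i, 10 - i] for i in range(1, 9)]
ordered, held_out = split_for_mirai(rows, n_predict=2, seed=1)
for row in ordered:
    print(row)
print("held out:", held_out)
```

Keeping the replaced values in `held_out` makes it easy to compare them with the predictions later.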

If an explanatory variable has no values in the training dataset but does have values in the prediction dataset, the rows need to be swapped so that the training dataset covers it. (MIRAI cannot give an answer for something it has not learned.)
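A quick way to detect this situation before pasting the data is sketched below. The helper name and the small example table are hypothetical; the check simply lists the explanatory variables that are used in some prediction row but never in a training row.

```python
def uncovered_variables(header, train_rows, predict_rows):
    """Return the variables that are non-zero in a prediction row
    but zero in every training row."""
    missing = []
    for col, name in enumerate(header):
        used_in_train = any(row[col] != 0 for row in train_rows)
        used_in_predict = any(row[col] != 0 for row in predict_rows)
        if used_in_predict and not used_in_train:
            missing.append(name)
    return missing


# Hypothetical amounts of three ingredients.
header = ["A", "B1", "B2"]
train_rows = [[1, 2, 0], [2, 1, 0]]
predict_rows = [[1, 0, 3]]
print(uncovered_variables(header, train_rows, predict_rows))
```

If the list is not empty, swap one of the offending prediction rows with a training row and run the check again.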

The dataset for prediction does not have to be included in the training, but when it is, the value should be set to -0.9876.

MIRAI builds its property-estimation formula as a product of power functions, each of the form

(linear function)^(exponent)

On rare occasions, a linear function fitted on the training data takes a negative value inside the parentheses when it is applied to the prediction data, and the power can then no longer be evaluated.

To avoid this breakdown, it is preferable that the training data include the samples at the outer edge of the data range.
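The breakdown can be illustrated with a minimal sketch. The exact functional form that MIRAI uses is not given here, so the code assumes one linear function per group raised to its own exponent; the function name, the coefficient layout and all numbers are hypothetical.

```python
def product_of_powers(x, groups):
    """Evaluate prod_g (intercept_g + sum_i coef_gi * x_i) ** exponent_g.

    groups: list of (intercept, {variable_index: coefficient}, exponent).
    Raises ValueError when a linear function is not positive, because a
    non-positive base with a fractional or negative exponent has no
    usable real value.
    """
    result = 1.0
    for intercept, coefs, exponent in groups:
        base = intercept + sum(c * x[i] for i, c in coefs.items())
        if base <= 0:
            raise ValueError("linear function evaluates to %g; cannot apply exponent" % base)
        result *= base ** exponent
    return result


# Hypothetical two-group formula: (1 + 0.5*x0)^2 * (2 + 1.0*x1)^0.5
groups = [(1.0, {0: 0.5}, 2.0), (2.0, {1: 1.0}, 0.5)]
print(product_of_powers([2.0, 2.0], groups))  # inside the fitted range

try:
    product_of_powers([2.0, -3.0], groups)  # outside the range: base is negative
except ValueError as err:
    print("breakdown:", err)
```

The second call fails because the prediction point lies outside the range the linear function was fitted on, which is the situation the text warns about.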

Finally, put similar explanatory variables into the same group. In this case, I put the hydrocarbon polymers B1 and B2 in one group and the three curing agents C1, C2, and C3 in another.

Since the data is pasted into the web app in a tab-delimited format, it is convenient to organize the data in Excel.
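If the data is prepared in a script rather than in Excel, the same tab-delimited text can be produced as in the sketch below. The column names and values are hypothetical, and the precise layout MIRAI expects should be checked against the tool itself.

```python
import csv
import io


def to_tab_delimited(header, rows):
    """Return the table as tab-delimited text, one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical table: sample name, objective variable, two ingredient amounts.
text = to_tab_delimited(["Name", "Res1", "A", "B1"], [["Ex1", 80, 10, 5], ["Ex2", 75, 8, 7]])
print(text)
```

The resulting string can be copied to the clipboard and pasted into the web app in the same way as cells copied from a spreadsheet.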

Then copy and paste the main table and the groupings into MIRAI, and click the MIRAI Read button.

When you want to retrieve the calculation results, press the STOP button. Then copy the formula that MIRAI outputs and paste it into the second row of the Excel sheet. For this formula to work correctly, the rows and columns must be laid out exactly as MIRAI expects.

Do not change the titles in the first row and first column, or the objective variable in the second row.

After STOP, press the START button to resume the calculation. Continue until the values converge.

If the target physical properties are numerical values, use them as they are; if they are ratings such as ◯, △, or X, convert them to suitable numerical values.
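A conversion of this kind might look like the sketch below. The numbers assigned to ◯ and △ are assumptions chosen for illustration; only X = 1 matches the coding used later on this page.

```python
# Hypothetical coding of the rating symbols; adjust to the scale of the data.
RATING_TO_NUMBER = {"◯": 3.0, "△": 2.0, "X": 1.0}


def encode_ratings(values):
    """Keep numbers as they are and replace rating symbols with numbers."""
    encoded = []
    for value in values:
        if isinstance(value, (int, float)):
            encoded.append(float(value))
        else:
            encoded.append(RATING_TO_NUMBER[value.strip()])
    return encoded


print(encode_ratings(["◯", "△", "X", 85.5]))
```

An unknown symbol raises a `KeyError`, which is a useful reminder to extend the mapping rather than silently guessing a value.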

Press the STOP button to pause the calculation. Copy the part of the formula after the = sign and paste it into the appropriate cell in the second row of the Excel sheet. Fill the formula down so that every row gets a calculated value, and put the values for the prediction rows in a separate column so that they can be told apart. Then graph the result. Continue the calculation until the values no longer change.

For comparison with MIRAI, I also run an ordinary multiple regression.

An ordinary multiple regression does not include the prediction data, so open Cross Term MR in YSB and paste in the dataset without the prediction rows. Press the Cross Term button and the result is displayed in Result. CT=0 is the result of ordinary multiple regression, so copy it back into Excel and make a graph.
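For readers without access to YSB, an ordinary multiple regression can be sketched in plain Python as below. This is a textbook least-squares fit via the normal equations, not the Cross Term MR implementation; the function names and the training data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: explanatory variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_multiple_regression(xs, ys):
    """Least-squares fit of y = b0 + b1*x1 + ...; returns [b0, b1, ...]."""
    design = [[1.0] + list(row) for row in xs]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Apply the fitted coefficients to one row of explanatory variables."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical training rows only; prediction rows are left out of the fit.
train_x = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
train_y = [1 + 2 * a + 3 * b for a, b in train_x]
coefs = fit_multiple_regression(train_x, train_y)
print(coefs)
print(predict(coefs, [3, 2]))
```

The fitted coefficients are then applied to the held-out rows with `predict`, which mirrors pasting the CT=0 result back into the spreadsheet.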

In the same way, I calculate the other four properties.

Only Res1 (specular gloss) is described and predicted well enough by the multiple regression (MR) method.

For the properties other than Res1, MIRAI is clearly superior to the MR method.

Once such a highly accurate formula has been built, the optimal formulation can be designed automatically by simulation alone.

This is why MIRAI (Multiple Index Regression for AI) is an excellent estimation formula for AI.

Coating Analysis Example 2


Aqueous pigment dispersions and their manufacturing methods

DIC Corporation's JP 2019-189852 is a patent that specifies the blending of resins, pigments, and solvents in terms of Hansen's dissolution sphere and interaction radius.

Running dissolution or dispersion tests on a resin or pigment in various solvents with known Hansen solubility parameters (HSP) gives the HSP center and the interaction radius of that target. Drawing a sphere of the interaction radius around the center gives Hansen's dissolution sphere.

When the HSP of a solvent, pigment, or other material falls within this interaction radius, that material is more likely to dissolve in the target.
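The test "inside the interaction radius" can be written down with the standard Hansen distance, Ra^2 = 4(dD1 - dD2)^2 + (dP1 - dP2)^2 + (dH1 - dH2)^2, and the relative energy difference RED = Ra / R0. The sketch below uses hypothetical HSP values for the target and the solvent.

```python
import math


def hsp_distance(a, b):
    """Hansen distance Ra between two HSP triples (dD, dP, dH)."""
    return math.sqrt(4.0 * (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2)


def relative_energy_difference(hsp, center, radius):
    """RED = Ra / R0; values below 1 lie inside the dissolution sphere."""
    return hsp_distance(hsp, center) / radius


# Hypothetical target (center and interaction radius) and solvent.
target_center, target_radius = (18.0, 9.0, 7.0), 8.0
solvent = (16.0, 9.0, 7.0)
print(hsp_distance(solvent, target_center))
print(relative_energy_difference(solvent, target_center, target_radius))
```

Note the factor 4 on the dispersion term: in the [dD, dP, dH] coordinates the "sphere" is compressed along dD.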

Web app Sphere Viewer

Drag=Rotate, Drag+Shift key=Magnify, Drag+Command key or Alt key=Move. If you click on a solvent, the name of the solvent will appear.


Some polymers may have hydrophobic and hydrophilic regions.

The DIC patent claims the following relationships.

The dissolution sphere of the pigment overlaps the dissolution sphere of the hydrophobic region of the resin (Scheme1).
The HSP of the solvent is inside the dissolution sphere of the pigment (Scheme2).
The HSP of the solvent is inside the dissolution sphere of the hydrophobic region of the resin (Scheme3).
The HSP of the solvent is outside the dissolution sphere of the hydrophilic region of the resin (Scheme4).
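The four relationships can be checked mechanically, as in the following sketch. It assumes that two spheres "overlap" when the Hansen distance between their centers is smaller than the sum of their radii; the patent's own definition may differ, and all HSP values and radii below are hypothetical.

```python
import math


def hsp_distance(a, b):
    """Hansen distance Ra between two HSP triples (dD, dP, dH)."""
    return math.sqrt(4.0 * (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2)


def check_schemes(pigment, hydrophobic, hydrophilic, solvent):
    """Each sphere is (center, radius); solvent is an HSP triple."""
    (p_c, p_r), (ho_c, ho_r), (hi_c, hi_r) = pigment, hydrophobic, hydrophilic
    return {
        # Assumed overlap criterion: center distance below the sum of radii.
        "Scheme1": hsp_distance(p_c, ho_c) < p_r + ho_r,
        "Scheme2": hsp_distance(solvent, p_c) < p_r,
        "Scheme3": hsp_distance(solvent, ho_c) < ho_r,
        "Scheme4": hsp_distance(solvent, hi_c) > hi_r,
    }


# Hypothetical spheres and solvent.
pigment = ((18.0, 8.0, 6.0), 5.0)
hydrophobic = ((17.0, 6.0, 5.0), 4.0)
hydrophilic = ((16.0, 12.0, 20.0), 6.0)
solvent = (17.5, 7.0, 6.0)
print(check_schemes(pigment, hydrophobic, hydrophilic, solvent))
```

The raw distances behind each check, rather than the booleans, are what would serve as explanatory variables in a regression.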


For eight pigments and five resins, the four HSP distances (Scheme1 to Scheme4) were calculated using the prescribed method.

The objective variable is the volume-average particle size (nm); larger values mean coatings with more coarse particles and lower dispersibility.

When analyzing this system with MIRAI, the calculation can be run with either of two error measures.

With Minimum Absolute Error (MAE), the coefficients are chosen to minimize the sum of the absolute differences between the training values and the calculated values.

Points that are far off then move even further off, while the remaining points line up closer and closer to a straight line.

This is very useful for identifying suspicious data, such as input errors.

With root mean square error (RMSE), MIRAI tries to pull the points with a large error closer to the straight line.

To do so, it assigns a small error to every point: once squared, a small error becomes so small that it has almost no effect.
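The difference between the two error measures can be seen in the simplest possible model, a single constant fitted to data containing one outlier. This is not MIRAI's optimizer, only an illustration: the constant that minimizes the sum of absolute errors is the median, and the constant that minimizes the sum of squared errors is the mean. The data are hypothetical.

```python
import statistics


def residuals(values, fitted):
    """Differences between the observed values and a constant fit."""
    return [v - fitted for v in values]


# Hypothetical data: four consistent points and one outlier.
values = [1.0, 2.0, 3.0, 4.0, 100.0]

fit_absolute = statistics.median(values)  # minimizes the sum of |error|
fit_squared = statistics.mean(values)     # minimizes the sum of error^2

print("absolute-error fit:", fit_absolute, residuals(values, fit_absolute))
print("squared-error fit: ", fit_squared, residuals(values, fit_squared))
```

With the absolute error, the four consistent points are fitted closely and the outlier stands out with a very large residual. With the squared error, the fit moves toward the outlier and every point receives a noticeable error.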

Ordinary multiple regression analysis also minimizes the squared error, but because it cannot take nonlinearities and interactions between items into account, some items end up far off and the correlation coefficient becomes very poor.

Coating Analysis Example 3


This example is also taken from a DIC patent (JPA 2017-031336).

The purpose of the patent is to obtain "an active energy ray curable composition that can form a cured coating film combining excellent coating appearance with high antistatic properties."

However, when the content of organic solvent is increased, the dispersion state of the metal oxide particles changes and agglomeration occurs, which causes whitening of the cured coating film and an increase in its surface resistance.

The patent claims a composition characterized by containing an organic solvent whose HSP satisfies the following:
The dispersion term (δD) is in the range of 15.6 to 16.1 MPa^0.5.
The polar term (δP) is in the range of 7.2 to 9.8 MPa^0.5.
The hydrogen bonding term (δH) is in the range of 8.2 to 11.4 MPa^0.5.

The actual experiment is summarized as follows.

The areas marked in green are the patent claims. The areas marked in yellow are the limits. Those that fall outside the scope are not colored.

There are a large number of patents that specify the Hansen solubility parameters (HSP) in these ranges.

However, plotting the results of the four evaluations against each HSP component gives very confusing figures, as follows.

dD figure

dP figure

dH figure

So the patent claim simply specifies the range covered by the experiments that received four ◯ ratings.

From the applicant's point of view, it is preferable that a patent be hard to see through for everyone except the person who wrote it.

So let's plot the HSP components in 3D space and take a look.

Web app Sphere Viewer

Drag=Rotate, Drag+Shift key=Magnify, Drag+Command key or Alt key=Move.


The translucent green sphere is Hansen's dissolution sphere.

Drag with the mouse to check the positions.
The red spheres are solvents for which all four ratings are ◯.
The blue spheres are solvents with poor ratings.

The Hansen solubility method describes the result as follows.
"Solvents that lie inside the Hansen solubility sphere are rated high (four ◯)."
The center of the Hansen sphere is [dD, dP, dH] = [15.9, 8.9, 9.8] and the radius is 2.0.

A patent claim specifies the range as a rectangular box rather than a sphere, so the claimed range is a little wider.
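The difference between the box-shaped claim and the sphere can be checked numerically. The sketch below uses the claimed ranges and the sphere center and radius quoted on this page, together with the standard Hansen distance; the function names are mine.

```python
import math

# Claimed ranges (MPa^0.5) and the fitted sphere, as quoted in the text.
CLAIM_BOX = ((15.6, 16.1), (7.2, 9.8), (8.2, 11.4))  # dD, dP, dH
SPHERE_CENTER = (15.9, 8.9, 9.8)
SPHERE_RADIUS = 2.0


def in_claim_box(hsp):
    """True when every HSP component lies inside its claimed range."""
    return all(low <= value <= high for value, (low, high) in zip(hsp, CLAIM_BOX))


def distance_to_center(hsp):
    """Hansen distance Ra from the sphere center."""
    d_d, d_p, d_h = (v - c for v, c in zip(hsp, SPHERE_CENTER))
    return math.sqrt(4.0 * d_d ** 2 + d_p ** 2 + d_h ** 2)


def in_sphere(hsp):
    return distance_to_center(hsp) <= SPHERE_RADIUS


corner = (15.6, 7.2, 8.2)  # lowest corner of the claimed box
print(in_claim_box(corner), in_sphere(corner), round(distance_to_center(corner), 2))
```

The lowest corner of the box is covered by the claim but lies at a distance of about 2.41 from the center, outside the sphere of radius 2.0. In this sense the box reaches a little further than the sphere.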

A claim of this kind is often regarded as an excellent way to keep other companies out of the market.

However, plotting the polar term (δP) shows that the applicant did not test any solvent with δP above 9.8 MPa^0.5.

The HSPiP and MIRAI analysis tools are like the two wheels of a cart.
Analyzing the data with MIRAI reveals many more things.

Let's analyze it a little deeper.

The HSP values used in the claim are not those of pure solvents but of mixtures.

In the actual patent, various solvents are used in combination, and the claim is stated in terms of the HSP range of the mixed solvent.
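The HSP of a solvent mixture is conventionally taken as the volume-fraction-weighted average of the component HSP values. A minimal sketch with two hypothetical solvents (not taken from the patent):

```python
def mixture_hsp(components):
    """components: list of (volume_fraction, (dD, dP, dH)).

    Fractions are normalized, so volumes in mL work as well as fractions.
    """
    total = sum(fraction for fraction, _ in components)
    if total <= 0:
        raise ValueError("total volume must be positive")
    return tuple(
        sum(fraction * hsp[i] for fraction, hsp in components) / total
        for i in range(3)
    )


# Hypothetical 50:50 blend of solvent A and solvent B.
blend = mixture_hsp([(0.5, (16.0, 8.0, 10.0)), (0.5, (15.0, 10.0, 6.0))])
print(blend)
```

This also shows why a blend can land inside a claimed range even though neither pure solvent does.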

The original table can be summarized as follows.

The last two experiments will be used as predictive data.

Four of the physical properties are used as the objective variables. Antistatic ability and surface resistance are particularly important.

Next, for MIRAI, we divide the solvents into groups.

This grouping is where a chemist's skill shows.

The grouping is determined not only by the functional groups of the solvent, but also by the amount used.

1. alcohol group
2. ester, ether group
3. ketone group
4. other

The table lists the coefficients for antistatic ability obtained by ordinary multiple regression and by MIRAI.

As the graph shows, even ordinary multiple regression reproduces the solvents used to build the equation reasonably well.

The calculated value for the prediction sample exceeds 6, but in this kind of rank evaluation any value of 4 or higher is acceptable.

Antistatic ability

The MIRAI result, on the other hand, estimates a sample whose experimental value is 1 as 1.79.
This is because the solvent composition of experiment CE12 is very similar to that of CE1, yet their evaluations are very different.

Surface resistance

Similarly, for surface resistance, two experiments are outliers in MIRAI.
These are again CE12 and CE1.

Ordinary multiple regression analysis cannot be used for systems with interactions between items or nonlinearities in the items. MIRAI is suited to analyzing such systems.

Can we then say that multiple regression analysis is sufficient for this system?

That would be a very common, and serious, mistake.


For example, every rating of CE10 is X. Starting from this, let's use the multiple regression coefficients to look for a formulation with a higher rating.

Let's reduce the S1-1 solvent from 0.67 g to 0 g and increase the S1-2 solvent by the same amount.

Property            Exp.   CE10 (MR)   New comp. (MR)   CE10 (MIRAI)   New comp. (MIRAI)
Antistatic          1      1           579.36           0.99           0.94
Surface resistance  39     39          -1092.72         39.04          40.05
Appearance          1.0    1           -22.07           1.0            0.97
Storage stability   1.0    1           1.0              1.02           0.27

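A small change in composition can change the multiple regression predictions drastically, by a factor of more than 500 in the case of antistatic ability, and in either the positive or the negative direction (antistatic ability goes from 1 to 579.36, surface resistance from 39 to -1092.72).

In neural network methods, this problem is well known as overfitting (over-learning).

The same kind of variation occurs in multiple regression analysis when the regression coefficients are very large positive and negative values.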
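How large the coefficients must be can be estimated from the table. Assuming the multiple regression is purely linear in the solvent amounts in grams (CT=0), and that only the 0.67 g moved from S1-1 to S1-2 changes, the jump in each prediction equals 0.67 times the difference between the two coefficients. The sketch below back-calculates that difference; the variable and function names are mine.

```python
MOVED_GRAMS = 0.67  # amount moved from solvent S1-1 to solvent S1-2

# (prediction for CE10, prediction for the new composition), from the MR columns.
MR_PREDICTIONS = {
    "Antistatic": (1.0, 579.36),
    "Surface resistance": (39.0, -1092.72),
    "Appearance": (1.0, -22.07),
    "Storage stability": (1.0, 1.0),
}


def implied_coefficient_gap(before, after, moved=MOVED_GRAMS):
    """Coefficient of S1-2 minus coefficient of S1-1, per gram, under the linear assumption."""
    return (after - before) / moved


for name, (before, after) in MR_PREDICTIONS.items():
    print(name, round(implied_coefficient_gap(before, after), 2))
```

For antistatic ability the two coefficients would have to differ by roughly 860 per gram on a rating scale that only spans a few units, which is the pattern of very large positive and negative coefficients described above.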

I am not an expert in statistics. Maybe the statistics will tell me not to use this result.

If we can't use multiple regression analysis, what are we supposed to do about formulation development? What can statistics do for this problem?

I need a good tool for formulation development, whether it's a neural network method, PLS method, or PCA method.

Maybe I'm not looking hard enough and such a tool already exists. If you know, please let me know.

I'm not good at statistics, but I am good at programming, so I created the MIRAI analysis tool for myself.

With MIRAI, changing the composition of CE10 slightly did not change the results significantly. Only storage stability, whose experimental value is 1, is predicted as 0.27; since a rating of X is coded as 1, every value below 1 is also read as X, so this is not a problem.


Copyright pirika.com since 1999-
Mail: yamahiroXpirika.com (Replace X with @.) The subject of your email should start with [pirika].