bdr76 / csvlint

CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting, fixed width datasets, change datetime format, decimal separator, sort data, count unique values, convert to xml, json, sql etc. A plugin for data cleaning and working with messy data files.

License: GNU General Public License v3.0

C# 97.35% Python 2.65%
csv notepad-plus-plus plugin fixed-width datasets metadata sorting sql syntax-highlighting tabular-data

csvlint's People

Contributors

bdr76, chcg, dependabot[bot], fruchtzwerg94, molsonkiko, rdipardo, shriprem

csvlint's Issues

Does not work with Notepad++ 8.4

Hi there

Unfortunately, the plugin does not work with the newest version anymore.
8.3.3 was working, 8.4 not anymore.

cheers
Reto

Unwanted outcome to Reformat - deleted spaces

When using Reformat to add column separators to a text file such as Example.txt, spaces adjacent to a separator are deleted even if “Trim all values” was not selected, and no matter the “Re-apply quotes” option selected (please note that the “Value with spaces” option seems to apply quotes to the full line, rather than quotes framing the values containing spaces).
For Example.txt and the schema.ini metadata below, this results in Example_reformatted_actual.txt.
We would like instead to reformat while keeping all spaces, including those adjacent to a separator (since they are significant in our case), as in Example_reformatted_target.txt.

[Example.txt]
Format=FixedLength
ColNameHeader=False
Col1=FIELD1 Integer Width 4
Col2=FIELD2 Integer Width 4
Col3=FIELD3 Integer Width 3
Col4=FIELD4 Text Width 4
Col5=FIELD5 Text Width 1
Col6=FIELD6 Text Width 2
Col7=FIELD7 Text Width 4

Example.txt
Example_reformatted_actual.txt
Example_reformatted_target.txt
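
The requested behaviour can be sketched as follows. This is a minimal illustration, not the plugin's implementation: the widths come from the schema.ini above, while the function names and the sample line are hypothetical (the attached Example.txt is not reproduced here).

```python
from itertools import accumulate

# Column widths taken from the schema.ini above (FIELD1..FIELD7).
WIDTHS = [4, 4, 3, 4, 1, 2, 4]

def split_fixed_width(line, widths=WIDTHS):
    """Slice a fixed-width line into fields without trimming any spaces."""
    ends = list(accumulate(widths))
    starts = [0] + ends[:-1]
    return [line[start:end] for start, end in zip(starts, ends)]

def reformat_line(line, separator=";", widths=WIDTHS):
    """Join the untouched fields, so spaces next to a separator survive."""
    return separator.join(split_fixed_width(line, widths))

# Hypothetical 22-character input line, written field by field for clarity.
sample = "  12" "   7" " 42" "AB  " "X" " Y" " CD "
print(reformat_line(sample))
```

Because the fields are sliced by position and never stripped, leading and trailing spaces stay attached to their values.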

License Regulation

Hi,

This is very silly, but I need a clear license statement for this software. Otherwise I can't use the plugin at my office.
Can you please add the GNU GPL to your Disclaimer?

This will help me a lot.

Coloring not working in multi-instance mode for separators other than ;

If you activate the multi-instance mode in Notepad++, coloring of columns only works if you use the semicolon as a separator. For all other separators, the columns are still correctly detected, but they are all colored the same (first color). (Tested in Notepad++ v8.4.9 64-bit portable)

Column Separator Detection Inconsistent

I need to edit a CSV file report that gets generated weekly. This report has exactly the same layout every single time. About 75-80% of the time CSV Lint picks up the formatting without a problem. However, once every month or two it is unable to detect the column separator, despite the header row being identical and the data nearly identical. I can't identify any pattern to distinguish why it picks it up sometimes and not at other times.

If it would be useful I'd be happy to include a copy of the file where it does and another where it doesn't pick up the column separator so you can try to figure out where it's getting stuck. I'll tell you now, however, that the header row is always the exact same.

Time,Caller Name,Caller Number,Callee Name,Callee Number,DOD,DID,Call Duration,Talk Duration,Status,Source Trunk,Destination Trunk,Communication Type,PIN Code,Caller IP Address,Recordfile

Minimal quotes, not escaping quotes itself

I got a CSV, Format=Delimited(;), from PowerShell Export-CSV with UseQuotes=AsNeeded (https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/export-csv?view=powershell-7.2) and want to apply the same rule with your reformat feature. So I suppose UseQuotes=AsNeeded translates to Minimal quotes, but it does not escape the quotes itself by double-double-quotes.

I have content to start some executables with arguments in a different column (one arguments column). And it looks like this:
arguments: arg1, arg2 with spaces
will result in CSV to:
...;"arg1 ""arg2 with spaces""";...

If I apply reformat with minimal quotes I get:
...;arg1 "arg2 with spaces";...
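
For comparison, the expected "quote only when needed, and double any embedded quotes" rule is what Python's standard csv module does with QUOTE_MINIMAL. The sketch below only illustrates the expected output; the helper name and the sample row are made up, and it says nothing about how the plugin's Reformat is implemented.

```python
import csv
import io

def write_minimal(fields, delimiter=";"):
    """Serialize one row, quoting only where needed and doubling embedded quotes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, quotechar='"',
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(fields)
    return buffer.getvalue().rstrip("\n")

# Hypothetical row: only the middle value contains quote characters.
row = ["app.exe", 'arg1 "arg2 with spaces"', "plain"]
print(write_minimal(row))
```

A value that contains the quote character is wrapped in quotes and its inner quotes are doubled, while the other values stay unquoted.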

First row color formatting when entries have newlines

Hi BdR76! Thanks for a great plugin for npp. I have an issue when I open a .csv file that has newlines embedded in some column entries. It seems that the color formatter is confused about the first row. Minimal working example below. Thanks again!! dpwilt
test2

Feature request/bugfix: Reformat for column width should account for column title width as well

I'm not sure whether it's intended behavior, but right now only the data in the column itself is taken into account when determining the column width for the reformat. This breaks the formatting when the title is longer than the longest data entry in the column, e.g. a column titled "tier" that holds only single-digit integer values, or any title on an empty column.

If it's a bug, and it makes a difference for reproduction purposes, I'm on Windows 8 (not 8.1) x64, NP++ v8.2 64-bit.
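
A minimal sketch of the requested behaviour, assuming the header and rows are already split into lists of strings (the function names are hypothetical and not taken from the plugin):

```python
def column_widths(header, rows):
    """Width per column: the longest of the header title and every data value."""
    widths = [len(title) for title in header]
    for row in rows:
        for index, value in enumerate(row):
            if index >= len(widths):
                widths.append(0)  # row has more fields than the header
            widths[index] = max(widths[index], len(value))
    return widths

def align(header, rows, separator=" "):
    """Pad every value to its column width so titles and data line up."""
    widths = column_widths(header, rows)
    return [separator.join(value.ljust(width) for value, width in zip(row, widths))
            for row in [header] + rows]

for line in align(["tier", "name"], [["1", "ab"], ["2", "abcdef"]]):
    print(line)
```

Seeding the widths with the title lengths is the only change compared with measuring the data alone.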

Analyse Data Report not displaying correct range of values

The analysis data report can display an inaccurate range. In the below screen capture you can see the range of 09/08/2022-09/30/2022 but the distinct values displayed show something different. The only reason I see this is because I changed the default setting for unique values from 15 to 35, otherwise I would have thought the range was correct.

Note: The data is in the format of MM/DD/YYYY

20: Date_Field
DataTypes : datetime (5106 = 99.9%)
Width range : 10 characters
DateTime range : 09/08/2022 ~ 09/30/2022
-- Unique values (30) --
n=175 : 09/01/2022
n=174 : 09/02/2022
n=156 : 09/03/2022
n=122 : 09/04/2022
n=115 : 09/05/2022
n=112 : 09/06/2022
n=160 : 09/07/2022
n=183 : 09/08/2022
n=172 : 09/09/2022
n=158 : 09/10/2022
n=110 : 09/11/2022
n=170 : 09/12/2022
n=178 : 09/13/2022
n=174 : 09/14/2022
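
The cause of the mismatch is not known from this report. As a point of reference, the sketch below shows how a range would be expected to come out when every value is parsed as MM/DD/YYYY before taking the minimum and maximum; the function name is hypothetical.

```python
from datetime import datetime

def datetime_range(values, fmt="%m/%d/%Y"):
    """Return (earliest, latest) over all values that parse with fmt."""
    parsed = []
    for value in values:
        try:
            parsed.append(datetime.strptime(value, fmt))
        except ValueError:
            continue  # skip values that are not valid dates in this format
    if not parsed:
        return None
    return min(parsed).strftime(fmt), max(parsed).strftime(fmt)

# A few of the unique values listed in the report above.
print(datetime_range(["09/08/2022", "09/01/2022", "09/14/2022"]))
```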

[bug] CSVLint 0.3: Exception reformatting medicine.csv to TSV

Notepad++: 32-bit, 8.1.3, dark mode enabled
CSVLint: 32-bit, 0.3

When reformatting medicine.csv to TSV, an exception is raised.
It should not be caused by invalid data (see the validation result in the screenshot).

Reproduction:

  1. Reformat medicine.csv to TSV
    image

  2. Exception
    image

See the end of this message for details on invoking
just-in-time (JIT) debugging instead of this dialog box.

************** Exception Text **************
System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at CSVLint.CsvEdit.ReformatDataFile(CsvDefinition csvdef, String reformatDatTime, String reformatDecimal, String reformatSeparator, Boolean updateSeparator, Boolean trimAll)
   at Kbg.NppPluginNET.CsvLintWindow.OnBtnReformat_Click(Object sender, EventArgs e)
   at System.Windows.Forms.Control.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
   at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
   at System.Windows.Forms.Control.WndProc(Message& m)
   at System.Windows.Forms.ButtonBase.WndProc(Message& m)
   at System.Windows.Forms.Button.WndProc(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)


************** Loaded Assemblies **************
mscorlib
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4300.0 built by: NET48REL1LAST_C.
    CodeBase: file:///C:/Windows/Microsoft.NET/Framework/v4.0.30319/mscorlib.dll.
----------------------------------------
CSVLint
    Assembly-Version: 0.3.0.0.
    Win32-Version: 0.3.0.0.
    CodeBase: file:///C:/Extra-Software/Notepad++/plugins/CSVLint/CSVLint.dll.
----------------------------------------
System.Windows.Forms
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4270.0 built by: NET48REL1LAST_C.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.Windows.Forms/v4.0_4.0.0.0__b77a5c561934e089/System.Windows.Forms.dll.
----------------------------------------
System
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4300.0 built by: NET48REL1LAST_C.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System/v4.0_4.0.0.0__b77a5c561934e089/System.dll.
----------------------------------------
System.Drawing
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4084.0 built by: NET48REL1.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.Drawing/v4.0_4.0.0.0__b03f5f7f11d50a3a/System.Drawing.dll.
----------------------------------------
System.Core
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4320.0 built by: NET48REL1LAST_C.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.Core/v4.0_4.0.0.0__b77a5c561934e089/System.Core.dll.
----------------------------------------
System.Configuration
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4190.0 built by: NET48REL1LAST_B.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.Configuration/v4.0_4.0.0.0__b03f5f7f11d50a3a/System.Configuration.dll.
----------------------------------------
System.Xml
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4084.0 built by: NET48REL1.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.Xml/v4.0_4.0.0.0__b77a5c561934e089/System.Xml.dll.
----------------------------------------
System.Windows.Forms.resources
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4084.0 built by: NET48REL1.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.Windows.Forms.resources/v4.0_4.0.0.0_de_b77a5c561934e089/System.Windows.Forms.resources.dll.
----------------------------------------
System.resources
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4084.0 built by: NET48REL1.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.resources/v4.0_4.0.0.0_de_b77a5c561934e089/System.resources.dll.
----------------------------------------
Accessibility
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4084.0 built by: NET48REL1.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/Accessibility/v4.0_4.0.0.0__b03f5f7f11d50a3a/Accessibility.dll.
----------------------------------------
mscorlib.resources
    Assembly-Version: 4.0.0.0.
    Win32-Version: 4.8.4084.0 built by: NET48REL1.
    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/mscorlib.resources/v4.0_4.0.0.0_de_b77a5c561934e089/mscorlib.resources.dll.
----------------------------------------

************** JIT Debugging **************
To enable just-in-time (JIT) debugging, the jitDebugging value
must be set in the system.windows.forms section of the
configuration file for the application or computer (machine.config).
The application must also be compiled with debugging enabled.

For example:

<configuration>
    <system.windows.forms jitDebugging="true" />
</configuration>

When JIT debugging is enabled, any unhandled exception
is sent to the JIT debugger registered on the
computer rather than being handled by this dialog box.

Not working for some valid CSV files

I'm attaching a couple of valid CSV files that CSV Lint is telling me have only one column:

dummy4.csv
dummy3.csv

MS Excel opens them correctly, without complaint, as do all of the other CSV-aware programs I could find. I thought the issue was that the header row didn't have the field names quoted, but neither quoting the header field names nor removing the header row entirely resolved the issue. I also tried removing the empty field at the end of the records (and its header field), but that didn't help either.

[EDIT: I'm using v0.4.5.2 in NPP v8.4.4 (64-bit)]
[EDIT2: I discovered that there is a newer version, v0.4.5.5β, so I installed and tried it, but this issue persists.]

[Feature request] Support for Long (BigInt) numbers

Hi,
Thanks for sharing this plugin - It has proven really useful several times.
Would it be possible for the plugin to detect long ints (or BigInts) when validating data?
For instance, when validating this sample csv data:

Col1;Col2;Col3
19097;170456;TEXT1
19097;18661704;TEXT2
16084;7405017000001464;TEXT3

It informs of an invalid value:

** error line 4: Column 2 value "7405017000001464" not a valid integer value
Inspected 4 lines, 1 data errors found, time elapsed 00:00:00.000

I know it is not a valid integer, but it is a valid long integer. Perhaps we could have an option in the plugin's settings to choose whether we work with ints or bigints? Or just accept bigints directly...

Thanks.
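
A rough sketch of the requested check, assuming "long" means a signed 64-bit integer. The function name and the bits parameter are illustrative only and do not correspond to an existing plugin setting.

```python
import re

INTEGER_RE = re.compile(r"[+-]?\d+")

def is_valid_integer(text, bits=64):
    """True if text is a whole number that fits in a signed integer of `bits` bits."""
    text = text.strip()
    if INTEGER_RE.fullmatch(text) is None:
        return False
    value = int(text)
    return -2 ** (bits - 1) <= value <= 2 ** (bits - 1) - 1

# The value from the sample data fails a 32-bit check but passes a 64-bit one.
print(is_valid_integer("7405017000001464", bits=32))
print(is_valid_integer("7405017000001464", bits=64))
```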

[feature request] Option to disable/enable plugin

Sometimes I load quite huge CSV files (200.000+ lines, 40+ columns) and do some searches and operations on them.
When CSVLint is installed, paging is heavily delayed, e.g. jumping to the end of one of those CSVs via CTRL+END takes over 85 seconds, while jumping to the start with CTRL+HOME takes 1 second.

For these cases I would like to temporarily disable CSVLint to work faster on those CSVs.
Right now I have to remove the CSVLint plugin and reinstall it afterwards.
Having an option to disable it would make life easier.

Thanks for your time
M. Bücher

Option to change font?

Just wondering if it would be possible to give the user the option to change the font of the CSV Lint window to something other than Courier? Thanks!

Feature Request: Boolean Column Datatype or Enum/Coded values

Hello. First off I'd like to say thank you for creating such a wonderful NP++ plugin!

The title pretty much says it all. I have several CSV files containing boolean columns, but CSVLint either detects them as Integer (1/0) or Text (true/false) datatypes. Could a new Boolean datatype be added as well, please?

Examples of some valid booleans (English):

True:

1
ENABLED
Enabled
enabled
ON
On
on
TRUE
True
true
YES
Yes
yes
Y
y

False:

0
DISABLED
Disabled
disabled
OFF
Off
off
FALSE
False
false
NO
No
no
N
n

Alternatively, instead of hard-coding a predefined set of common booleans, it might be more preferable to allow the user to specify valid "true" and "false" synonyms. Regular expressions could handle this well.

For example, to match the synonyms I provided above, these regular expressions would work:

True: ^(1|ENABLED|[Ee]nabled|ON|[Oo]n|TRUE|[Tt]rue|YES|[Yy](es)?)$
False: ^(0|DISABLED|[Dd]isabled|OFF|[Oo]ff|FALSE|[Ff]alse|NO|[Nn]o?)$
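
The two expressions can be tried out with a short script. This only exercises the regular expressions proposed above; parse_bool is a hypothetical helper, not a plugin function.

```python
import re

# The patterns are copied verbatim from the request above.
TRUE_RE = re.compile(r"^(1|ENABLED|[Ee]nabled|ON|[Oo]n|TRUE|[Tt]rue|YES|[Yy](es)?)$")
FALSE_RE = re.compile(r"^(0|DISABLED|[Dd]isabled|OFF|[Oo]ff|FALSE|[Ff]alse|NO|[Nn]o?)$")

def parse_bool(text):
    """Return True or False for a recognised synonym, None for anything else."""
    if TRUE_RE.match(text):
        return True
    if FALSE_RE.match(text):
        return False
    return None

for sample in ["Yes", "y", "OFF", "n", "maybe"]:
    print(sample, parse_bool(sample))
```

Returning None for unrecognised text lets a caller fall back to the Text datatype when a column contains anything outside the two synonym sets.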

Thank you and best regards.

Remove a column

Once the format of the file has been detected, you can add / split / search & replace by column, but I couldn't find an option to remove the column. For columns with varying length text, or columns at the end of the lines, they don't line up well to use Notepad's Column Mode selection tool. The Reformat --> Align Vertically, then a column select and delete is helpful for this, but it would be cleaner to have a simple option of just removing the column.

Thanks for a helpful tool!

Feature Request: Manually specify Header Row/Separator character when detecting metadata

Currently when there is no difference in data types between header and data in any column, CSVLint assumes there is no header row. While this is completely understandable, it would be beneficial to be able to specify that a header row exists, assuming that this behavior is intended. A checkbox in the CSVLint window would be most preferable in my mind but even a toggle in the plugin menu would be very helpful.

ColNameHeader=False

Job Name, GUID, Error Message, Run in Audit Mode
"CONTACT_UPDATE","1234567890abcdef","Failed to process; reasons","No"

ColNameHeader=True

Job Name, GUID, Error Message, Run in Audit Mode,Thing
"CONTACT_UPDATE","1234567890abcdef","Failed to process; reasons","No","1"

Windows 10 21H2 x64, NPP 8.4.1 32-bit, CSVLint 0.4.5.1 in case it matters. Thanks for making such a handy plugin!

No "Detect Columns" option available

I don't have a "detect columns" option in CSV Lint. I was trying to get CSV Lint to recognize a CSV file and was unable to get it to detect the columns in said file. I eventually figured out that specific issue was caused by the bug mentioned in #36, but during the process of trying to get it to detect my file I realized I don't have any option to detect columns. Images attached.

image
image

Autodetection of columns not working for a more-than-one-row header

Hi,

I miss a setting where the header can be more complex than just one row. Many data logs have a header containing additional information spread over more than one line, followed by the actual header with the column names.
I have three ideas how to implement it:

  1. Sometimes the header is distinguishable by a first char (example: ~).
  2. If the header cannot be determined automatically, there could be a setting that specifies the row where the column names start (in this example line 13, with the data starting in line 14).
  3. Another way could be to start the autodetection not in the beginning of the data but in the first couple of not empty lines from the end of a file.

example file (I hope the data is not altered by the editor, the data here is TAB delimited):
~Resultfile from Basytec Battery Test System
~Date and Time of Data Converting: 10.11.2022 12:59:26
~
~Name of Test: Test Battery xyz
~Battery: LI-123_yx
~Testplan: LI-123_yx-Test.pln
~Testchannel: 1054 CH11 CTS
~Start of Test: 10.11.2022 10:38:59
~End of Test: 10.11.2022 12:52:38
~Operator (Test): justme
~Operator (Data converting): justme
~
~Time[h] DataSet t-Step[h] t-Set[h] Line Command U[V] I[A] Ah[Ah] Ah-Charge Ah-Discharge Ah-Step Wh[Wh] T1[°C] R-AC R-DC Climate-T Cyc-Count Count State
0 1 0 0 2 Pause 4.14033592686097 0 0 0 0 0 0 42.96556 0 0 43 1 1 3
5.5E-7 2 5.5E-7 5.56111111111111E-7 2 Pause 4.14033592686097 0 0 0 0 0 0 42.96556 0 0 43 1 1 0
0.0002777775 3 0.0002777775 0.0002777775 2 Pause 4.14033592686097 0 0 0 0 0 0 42.96907 0 0 43 1 1 2
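
The first idea (a header marked by a leading character) could look roughly like this. It is a sketch under the assumption that every preamble line, including the column-name line, starts with the same prefix and that the data is TAB delimited; split_preamble is a hypothetical name.

```python
def split_preamble(lines, prefix="~", delimiter="\t"):
    """Split lines into (preamble, column names, data rows).

    The last prefixed line before the data is treated as the column header.
    """
    first_data = next(
        (i for i, line in enumerate(lines) if not line.startswith(prefix)),
        len(lines),
    )
    data = [line.split(delimiter) for line in lines[first_data:]]
    if first_data == 0:
        return [], None, data  # no prefixed lines at all
    preamble = lines[:first_data - 1]
    header = lines[first_data - 1][len(prefix):].split(delimiter)
    return preamble, header, data

# Shortened version of the example file above (three columns only).
sample = [
    "~Resultfile from Basytec Battery Test System",
    "~",
    "~Time[h]\tDataSet\tU[V]",
    "0\t1\t4.14033592686097",
    "5.5E-7\t2\t4.14033592686097",
]
preamble, header, data = split_preamble(sample)
print(header, len(data))
```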

Downloading 0.4.5 release. Fail to unpack

I tried to download the 0.4.5 (x64) release from Github a couple of times. Version 0.4.4 (x64) works well.
When I try to unpack either 7zip or Windows explorer shows an error:

image
image

column split doesn't quote strings containing separator char in new columns

I was playing around with the column split feature, and I noticed that while it works as advertised in general, there's a minor bug where strings containing the separator char are not enquoted in the new CSV produced by the column split feature.
This seems pretty nitpicky, so probably a low priority to address. Thanks for making an awesome plugin!

For example, I have a CSV file:
silly_example before dummy split by cities
If I split the "cities" column using the "decode multiple value" option with the values "FUDG;BUS;GOLAR;YUNOB;MOKJI" I get
silly_example after dummy split by cities
Since the separator char is present in one of the new columns, we get some weirdness.

A similar bug occurs whenever I use any of the other column split options (e.g., split on position 3, split on character "t"); values containing the separator char are not wrapped in quotes.
If I choose not to discard the original column, the quotes stay in place in the original column but not the generated columns:
silly_example after dummy split by cities keep og column

Number of columns with quotes not valid

First of all, thanks and congratulations for the helpful and great plugin.

I found an issue when validating columns that contain quoted delimiter-characters,

Quote character is: " (double-quotes)
Delimiter character is: , (comma)

This is my validation config:

Format=CSVDelimited
ColNameHeader=True
Col1=a Text Width 10
Col2=b Text Width 10
Col3=c Text Width 10

This is my test data:

a,b,c
a1,a2,a3
"a1","a,2","a3"
"a1","a2","a3"

The 3rd line throws a validation error, although the 2nd field "a,2" is validly quoted.

This is the error-message:

** error line 3: Too many columns
Inspected 4 lines, 1 data errors found, time elapsed 00:00:00.000
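
For reference, a quote-aware parser sees three columns in every line of this test data. The check below uses Python's standard csv module purely as an independent comparison; count_columns is a hypothetical helper.

```python
import csv

def count_columns(text, delimiter=",", quotechar='"'):
    """Return the number of fields per line, honouring quoted delimiters."""
    reader = csv.reader(text.splitlines(), delimiter=delimiter, quotechar=quotechar)
    return [len(row) for row in reader]

# The test data from the report above.
data = 'a,b,c\na1,a2,a3\n"a1","a,2","a3"\n"a1","a2","a3"\n'
print(count_columns(data))
```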

Would be great if you could look into this.

Thanks & best regards,
Bernhard

Unable to add this into Notepad++

Hi,

I am trying to add this as a plugin to 64-bit Notepad++ version 8.1.4. I get the popup below when launching Notepad++.

Can you please help in resolving.

Thanks
Sam
Error Msg

It stops responding when I try to use reformatting

Hi,

It stops responding when I try to use reformatting, or split column. Highlighting works without a problem, then I go to CSV Lint window and still no problem, it shows the meta data and shows the errors in data, but when I click the errors, it again stops responding (the whole Np++)

What makes me wonder is that every feature worked perfectly in my first tries. 30 mins later (or maybe when I close and reopen it) they started not working.

Thanks a lot,
Gustav

Feature Request, auto format/unformat when loading and saving a file

First off, I LOVE this plugin! I have to edit csv files regularly and this is hugely helpful! The 1 feature I'd LOVE to see is an auto-format/auto-unformat option whereby when you open a csv file it automatically applies the format options and upon saving it undoes the formatting options. That way the user couldn't mess up the file by formatting it a specific way but still have the advantages of formatting while working with it!

Make Reformat keep the quotes as-is like in the original file

Hi,
Thank you for working on this plugin for Notepad++.

I'm trying to edit CSV files that aren’t compliant with RFC 4180 CSV rules, and I can’t change their format. They have some rows of differing lengths, and some fields are enclosed in quotes even though they don’t strictly have to be according to the standard. The files also have 4 header rows. I’ve been searching for an editor that will allow me to move some text to the right (by adding extra delimiters) so that info in the header is kept when I delete columns, and then align the delimiters so I can delete some columns, WITHOUT changing the formatting of the CSV file. I don’t know that behaviour could be added as options to this plugin?

I’ve tried LOTS of editors and only found one commercial piece of software that does what I need. I've been surprised how difficult it's been to find a way of doing this-spreadsheets mangle the quotes, and I can't find a text editor that will properly align the delimiters and keep the quotes.

What I'm doing is importing the file (which doesn't natively have a .csv extension). I'm then using the manual 'Detect Columns' to specify the delimiter. Ideally the plugin would be smart enough to not detect delimiters within quotes.

My next step is to use the 'Reformat' and 'Align vertically' options, at which point all the quote characters are removed from the file. None of the 'Re-apply quotes' options put them back everywhere they were. An option to not remove any quotes would be great.
Datalogger files often have more than one line of headers as they store other info on the configuration. These files have 4 lines of headers, and the 'Align vertically' doesn't add enough space to all rows that they are properly aligned. #46 may relate to this (Autodetection of columns not working for more-than-one-row header).

Here's a screenshot showing the removal of quotes and vertical alignment issues:
2023-03-31 10_23_32-CSV_Editors_CSV_Lint_inNotepad++_AlignmentIssues

What I'd hoped to do as the final steps was to use Notepad++'s multi-row editing to delete columns (although I see there is an enhancement to allow that in CSV-lint #54).

Last step was to be to remove the 'Align vertical', and save the modified file.

I've attached a sample file, and what I'd like it to look like after editing (both have had the file extension changed to .csv so I can upload them here).

BW,
A

CSV_Lint_inNotepad++_ExampleDataFile__2023_02_28AB1.csv

CSV_Lint_inNotepad++_ExampleDataFile_Edited__2023_03_30AB1.csv

Color doesn't work when changing View

When you move a CSV to "other view" using the right click option on the title of the CSV in NotePad++, the color is not applied correctly on the CSV.

If you move the CSV back to the 1st view, the color works correctly again.

Automatically detect background color

Hi,

Since I always need to change the settings of the plugin after changing the theme of Notepad++ to apply the right colors, autodetecting this would be awesome.
I am not deep enough in the code of the plugin to know where and how to do this exactly, but you may want to have a look at my plugin PlantUmlViewer, which has a similar feature implemented.

The following snippets were the solution for me:

  1. Detect the change of the editors background color
    PlantUmlViewer PlantUmlViewer.cs L161
public void OnNotification(ScNotification notification)
{
    //NPPN_DARKMODECHANGED or NPPN_WORDSTYLESUPDATED
    if (notification.Header.Code == (uint)NppMsg.NPPN_FIRST + 27
        || notification.Header.Code == (uint)NppMsg.NPPN_WORDSTYLESUPDATED)
    {
        UpdateStyle();
    }
}
  2. Get the background color
    PlantUmlViewer PlantUmlViewer.cs L168
IntPtr editorBachgroundColorPtr = Win32.SendMessage(PluginBase.nppData._nppHandle,
    (uint)NppMsg.NPPM_GETEDITORDEFAULTBACKGROUNDCOLOR, 0, 0);
int bbggrr = editorBachgroundColorPtr.ToInt32();
Color editorBackgroundColor = Color.FromArgb(bbggrr & 0x0000FF, (bbggrr & 0x00FF00) >> 8, (bbggrr & 0xFF0000) >> 16);
  3. Determine if dark or light
    PlantUmlViewer PreviewWindow.cs L155
bool newIsLight = editorBackgroundColor.GetBrightness() > 0.4;
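
The decoding and the light/dark decision in the snippets above can be tried outside Notepad++ with a small script. It assumes that the brightness used in the C# snippet corresponds to HSL lightness, which is what colorsys.rgb_to_hls returns as its second value; the function names are hypothetical and the 0.4 threshold is taken from the snippet.

```python
import colorsys

def decode_bbggrr(value):
    """Split a 0xBBGGRR colour value into (red, green, blue)."""
    red = value & 0x0000FF
    green = (value & 0x00FF00) >> 8
    blue = (value & 0xFF0000) >> 16
    return red, green, blue

def is_light(value, threshold=0.4):
    """True if the HSL lightness of the colour exceeds the threshold."""
    red, green, blue = decode_bbggrr(value)
    _hue, lightness, _sat = colorsys.rgb_to_hls(red / 255, green / 255, blue / 255)
    return lightness > threshold

print(is_light(0xFFFFFF))  # white background
print(is_light(0x1E1E1E))  # dark grey background
```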

Would be a great feature for this plugin as well if applicable.

Coloring of CSV documents not optimal

Description

This issue has 2 aspects. If you decide to solve number 1, then number 2 is automatically obsolete.

  1. When placing the caret to a line of a CSV document, all color markings of CSV columns disappear. The background color is changed to Npp's color for highlighting the current line (where the caret is placed). This reduces the usefulness of the plugin. It would be more helpful to use the foreground color (i.e. the color of the characters) for color marking of CSV columns. This way the line where the caret is placed would still contain color markings.

  2. The default color set for coloring column content of CSV documents contains for the 1st, 9th, ... column a color which is nearly identical to Npp's color for highlighting the current line (where the caret is placed). Thus it is nearly impossible to distinguish the current line from that said columns.

Applies to

Plugin version 0.4.1

Reformat csv to fixed width: header names are always removed

(I almost posted this topic on the NPP-Plugin forum https://community.notepad-plus-plus.org/post/85307 , but I was told to post it here)

I tried to reformat a CSV with semicolons as separators to a "Fixed Width"

a) Problem: my header line was totally removed
b) Suggestion: It removed also all separators, and now I have a long "stringwereallpartsare" concatenated. I think the feature in UltraEdit to keep the separators is a fine thing.

Images / screenshots can be seen in the NPP-forum - link in first line.

Thanks and regards!

CSVLint 0.4.5.1 crashes Notepad++ with a CSV file open and running `editor.getLexerLanguage()` in PythonScript console

This issue is very similar to issue #25. Very likely, both these issues have a common source of error and a common fix. Apologies for posting this as a separate issue.

Steps to Replicate

  1. Open PythonScript console.
  2. Open a few files of standard language types such as .xml, .cpp, .py, etc.
  3. With each of these file types, execute the following command in console to query the name of the lexer: editor.getLexerLanguage(). You will promptly see the name of the lexer reported back in the PythonScript console.
  4. Now, open a CSV file in Notepad++ 8.4.1. The CSV file is colorized as expected by the CSVLint plugin.
  5. Execute the command in PythonScript console to query the name of the lexer for the CSV file open in the editor: editor.getLexerLanguage().
  6. Notepad++ will crash immediately and silently, with no messages. The screenshot below was taken just prior to pressing ENTER with the editor.getLexerLanguage() command.

image

Here is the DebugInfo for the NPP 8.4.1 instance that crashed, with CSVLint 0.4.5.1 and PythonScript 2.0.0.0.

Notepad++ v8.4.1   (64-bit)
Build time : May  8 2022 - 19:51:18
Path : E:\Downloads\NPP\npp.8.4.1.portable.x64\notepad++.exe
Command Line : 
Admin mode : OFF
Local Conf mode : ON
Cloud Config : OFF
OS Name : Windows 11 (64-bit) 
OS Version : 2009
OS Build : 22000.675
Current ANSI codepage : 1252
Plugins : CSVLint.dll PythonScript.dll 

No Crash with CSVLint 0.4.5 in NPP 8.3.3

For comparison, with the same steps described above, there were no crashes with CSVLint 0.4.5. Of course, CSVLint 0.4.5 only works on NPP 8.3.3 or earlier.
image

And, the DebugInfo for when there were no crashes:

Notepad++ v8.3.3   (64-bit)
Build time : Mar 13 2022 - 17:20:02
Path : E:\Downloads\NPP\npp Archive\npp.8.3.3.portable.x64\notepad++.exe
Command Line : 
Admin mode : OFF
Local Conf mode : ON
Cloud Config : OFF
OS Name : Windows 11 (64-bit) 
OS Version : 2009
OS Build : 22000.675
Current ANSI codepage : 1252
Plugins : CSVLint.dll FWDataViz.dll mimeTools.dll NppConverter.dll NppExport.dll PythonScript.dll 

Some Novice Debugging

I did some debugging with your code repo in my Visual Studio. The crash started with the update for lexer5 commit. In particular, I tried commenting out the single significant line of code in the static IntPtr CreateLexer(IntPtr pName) function that was added in this commit.

        [DllExport(CallingConvention = CallingConvention.StdCall)]
        static IntPtr CreateLexer(IntPtr pName)
        {
            // function will be called by scintilla
            // Required for Notepad++ update from iLexer4 -> iLexer5

            string sName = Marshal.PtrToStringAnsi(pName);

            if (sName == ILexer.Name.Trim('\0'))
            {
                //return ILexer.ILexerImplementation();
            }
            return IntPtr.Zero;
        }

With the DLL built after commenting out the return ILexer.ILexerImplementation(); line, the crashes don't occur. But of course, there will be no colourization of the CSV file. And, the lexer name is reported as null.

I am not that familiar with C# and .NET technologies. But it appears to me that the CreateLexer() was only the first step needed for the Scintilla5/Lexilla5 upgrade in NPP. And, very likely, the ILexer.ILexerImplementation() method in your code may also need some tweaking for the Lexilla5 upgrade.

How to work with text selection?

It seems that the commands (analyse, reformat, ...) always use the entire file. Is it possible to restrict the functions to the selected text only?

Option to choose smallest possible numeric types

One feature that would be very nice to include would be to (optionally) automatically calculate the smallest numeric type necessary for a column. Probably all floating point values should be stored as doubles or decimals, to avoid loss of precision, but AFAIK pandas and most DBMS don't automatically determine the smallest integer type that could be used for a column.

For example, with this option active, maybe the Generate metadata form making a Python script would specify np.int32 for columns with no values outside the range (-2**31, 2**31 - 1), np.int64 for integers in the range (-2**63, 2**63 - 1), decimal for really huge integers, and so on and so forth.
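The range check described above can be sketched in plain Python. This is only an illustration of the idea, not CSVLint's implementation: the function name `smallest_int_type` is hypothetical, the 8-bit and 16-bit steps are an added assumption beyond the int32/int64 ranges mentioned here, and the returned strings are just labels that a generated script could map to types such as np.int32.

```python
def smallest_int_type(values):
    """Return a label for the narrowest signed integer type that holds all values.

    Hypothetical helper: falls back to "decimal" for integers beyond 64 bits.
    """
    values = list(values)
    if not values:
        raise ValueError("column has no values")
    lo, hi = min(values), max(values)
    # Try each signed width in turn; the bounds are -2**(bits-1) .. 2**(bits-1) - 1.
    for bits in (8, 16, 32, 64):
        if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
            return "int%d" % bits
    return "decimal"


print(smallest_int_type([3, -40000, 12]))  # prints int32
```

Because the result depends only on the column's minimum and maximum, a single pass over the data is enough to choose the type.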

I can see downsides for this, especially if you don't have any particular reason to believe that the dataset author won't throw some anomalous data with really big/small values at you in the future. I can also see why maybe it doesn't matter that much unless you're using CSVLint to preview a very large dataset.

Validate Data does not recognize scientific number format

Hi,

it seems the classification as FLOAT does not recognize scientific number formats like 1.24E-3.
(just for info: 1.24E-3 = 1.24*10^(-3) = 0.00124)
** error line 2746: Column 3 value "5.55462962962963E-7" not a valid decimal value

It would be nice if it was supported or at least excluded as an error during validation.
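As a rough sketch of what accepting such values could look like, the pattern below matches plain decimals and adds an optional exponent part. It is written in Python for illustration only and is not taken from CSVLint's code. The name `is_decimal` is hypothetical, and the pattern assumes `.` as the decimal separator.

```python
import re

# Optional sign, digits with an optional fraction (or a bare fraction),
# then an optional exponent such as E-7 or e+10.
DECIMAL_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def is_decimal(text):
    """Return True if text is a decimal number, optionally in scientific notation."""
    return DECIMAL_RE.fullmatch(text.strip()) is not None


print(is_decimal("5.55462962962963E-7"))  # prints True
print(is_decimal("1.24E"))                # prints False
```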

Thanks

CSVLint 0.4.5 not compatible with Notepad++ 8.4 32-bit

Trying to add CSVLint 0.4.5 through Plugins Admin fails with the following message:

Loading CreateLexer function failed.
CSVLint.dll is not compatible with the current version of Notepad++

Notepad++ version is 8.4 32-bit
OS is Windows 10 20H2 (OS Build 19042.1645)

Error message when Notepad++ starts up after plugin's installation

Description

When Notepad++ starts up after installing the plugin, the following error message is displayed:

image

Analysis

The reason is that the plugin's ZIP package contains the plugin's DLL file and the missing file CSVLint.xml as well. Thus, during plugin installation both files are copied to the plugin's directory. The plugin installer of Notepad++ is not able to copy the file CSVLint.xml to the correct location.

Possible Fix

The file CSVLint.xml should not be included in the plugin's ZIP package. Instead it should be written by the plugin when it starts up and notices that the file is missing.
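The "write it if it is missing" idea can be sketched as follows. The plugin itself is written in C#, so this Python version only illustrates the logic. The function name `ensure_config`, the path, and the default content are all hypothetical placeholders and do not reflect the real contents or location of CSVLint.xml.

```python
import os


def ensure_config(path, default_text):
    """Create the config file with default_text only if it does not exist yet.

    Returns True if the file was created, False if it was already present.
    """
    if os.path.exists(path):
        return False  # never overwrite a file the user may have customised
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(default_text)
    return True
```

Calling such a check once at plugin start-up would make the file independent of how the installer unpacks the ZIP package.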

Applies to

Plugin version 0.4.1

Loading CreateLexer function failed.

After installing the latest Notepad++, I receive an error:

Loading CreateLexer function failed.
CSVLint.dll is not compatible with the current version of Notepad++.
Do you want to remove this plugin from the plugins directory to prevent this message from the next launch?

Yes    |     No

220706_6s

Current version:
image

Request: Sort by Column(s)

Apologies in advance if entering a request as an "issue" is bad manners - I could not see anywhere else to do this.
This is a really useful plug-in for quickly inspecting CSV files for issues (especially trailing zeros, which Excel always hides/ignores).
It would be awesome if you could add the below to your to-do list...

  • Sort data by one or more column(s)

Thanks!
TEG

Plugin's toolbar button and its menu entry "CSV Lint window" don't toggle visibility of plugin's window

Description

When clicking the plugin's toolbar button or its menu entry CSV Lint window, the plugin's docked window becomes visible. But it is not possible to click the toolbar button or the menu entry again to hide it.

Expected Behaviour

Clicking the toolbar button or the menu entry should toggle the visibility of the plugin's window.

Actual behaviour

After the plugin's window has been made visible, neither clicking the toolbar button nor the menu entry toggles the visibility of the window.

Applies to

Plugin version 0.4.1

Highlight update and multiline column value

When editing a CSV file containing multi-line text values (quote character is "), if I edit the second or a later line of such a value, the highlighting update treats the edited line as a new record whose first column starts at the first character.

Before editing:
image

After editing the 3rd line:
image

The only way to work around this issue, to my knowledge, is to save the file after editing and reload it.
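For reference, the expected treatment of a quoted multi-line value can be shown with Python's standard csv module: a line break inside quotes stays part of the same field instead of starting a new record. The sample data is made up for illustration and says nothing about how the CSVLint lexer works internally.

```python
import csv
import io

# Hypothetical sample: the second record holds a quoted value spanning three lines.
sample = 'id,comment\n1,"first line\nsecond line\nthird line"\n2,plain\n'

rows = list(csv.reader(io.StringIO(sample, newline="")))

for row in rows:
    print(row)
# ['id', 'comment']
# ['1', 'first line\nsecond line\nthird line']
# ['2', 'plain']
```

A highlighter that restarts at the edited line loses the information that it is inside an open quote, which matches the behaviour described above.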

CSVLint crashes Notepad++ with a CSV file open and running `editor.getProperty('qwerty')` in PythonScript console

Using

Notepad++ version 8.4.1 with only these two plugins installed:

  1. CSVLint version 0.4.5.1
  2. PythonScript version 2.0.0.0

Steps to Replicate

  1. Open a CSV file in Notepad++ 8.4.1. The CSV file is colorized as expected by the CSVLint plugin.
  2. Open PythonScript console.
  3. Enter a command in console to set a property value. Example: editor.setProperty('TESTING', 'Hello, World!')
  4. Now, enter a command in the console to query that property value. Example: editor.getProperty('TESTING')
  5. Notepad++ will crash with no messages. The screenshot below was taken just prior to pressing ENTER with the editor.getProperty('TESTING') command.

image

Additional Info

This issue does not occur when tested with Notepad++ version 8.3.3, CSVLint v0.4.5, PythonScript v2.0.0.0.

The screenshot below shows the same steps working fine with the prior versions of CSVLint and Notepad++.

image

You will notice that with Notepad++ version 8.3.3 & CSVLint v0.4.5, I also have the Fixed Width Data Visualizer plugin side panel open. In fact, I am the author of the FWDataViz plugin.

Recently, it came to my notice that in Notepad++ v8.4.1, with the FWDataViz side panel open, if I opened a CSV file, NPP would crash. I did some debugging with my plugin's source code and determined that the cause for the crash was at the point when my plugin issued a SCI_GETPROPERTY SendMessage to Scintilla. To confirm that the plugin at issue is CSVLint, I was able to reproduce the crash without the FWDataViz plugin, and with just the PythonScript plugin by issuing the editor.getProperty('TESTING') command while a CSV file was open in the editor.

Also, this crash does not occur when a different file type than a CSV file is the active document in the NPP editor when I issue the editor.getProperty('TESTING') command. In the screenshot below note that a CSV file is actually open in the first tab.

image
