I wrote a function to take a number of files I wanted hashed as an array of fully qualified file paths. The function then iterates over the array to build a giant PowerShell command containing all the paths, parses the output of PowerShell and returns a dictionary with each of the fully qualified files as an index and it’s hash as the value.
It’s supposed to be a black box into which you pass an array of files and receive a dictionary of all those files and their hashes. It works well, but I did discover a limit. I got an error when trying to send a text string that was too big to fit into the buffer to run the command. So I had to update my function to check how big the string was getting. When it got too large, I’d run the previous string to get those files, then restart the string with the file that was likely to crash the command, and continue looping until the entire array was processed.
I thought I’d include my dirty programming here if you dare to look at it. It might turn you to stone like looking at Medusa without a mirror! It’s that horrid right now. I am going to refactor it, but shipped it as is so my customer could continue their work.
Function PS_GetFileHashes(aFiles As Variant, sHashAlgorithm As String) As Scripting.Dictionary
Const PS_Cmd_Beg As String = "Get-ChildItem -Path "
Const PS_Cmd_Mid As String = " -File | Get-FileHash -Algorithm "
Const PS_Cmd_End As String = " | ForEach { $_.Path + '|' + $_.Hash }"
Dim t As New HAL_TickCounter: t.Init
Dim LenCmd As Long, aIdx As Long
t.Elapsed "GetFileHashes Begin"
Dim retVal As New Scripting.Dictionary
Dim sPSCmd As String, itm As Variant, sPathsPart As String, sPreviousPart As String, aOutput As Variant, itmParts As Variant
On Error GoTo Error_Handler
Select Case sHashAlgorithm
Case "MACTripleDES", "MD5", "RIPEMD160", "SHA1", "SHA256", "SHA384", "SHA512"
Case Else
MsgBox "Unknown Algorithm", vbCritical Or vbOKOnly, "Operation Aborted"
GoTo Error_Handler_Exit
End Select
LenCmd = Len(PS_Cmd_Beg & PS_Cmd_Mid & sHashAlgorithm & PS_Cmd_End)
' Need to split aFiles into manageable segments. 515 files overloaded the PS_GetOutput in this case
' Maybe get every 50 - 100 files at a time.
' Looks like 32656 characters is near the limit, 32766 is over the limit 32768 is
aIdx = LBound(aFiles)
Do While aIdx <= UBound(aFiles)
sPathsPart = ""
Do
sPreviousPart = sPathsPart
sPathsPart = sPathsPart & HALutil.WrapText(aFiles(aIdx), "'") & ","
aIdx = aIdx + 1
If aIdx >= UBound(aFiles) Then Exit Do
Loop Until Len(sPathsPart) - 1 + LenCmd > 32656
' We left because either we reached the end of the array or we exceeded the limit.
' If we exceeded the limit then we need to back one element off of sPathsPart and the aIdx
If Len(sPathsPart) - 1 + LenCmd > 32656 Then
aIdx = aIdx - 1
sPathsPart = sPreviousPart
End If
'Remove the comma from the end if there, if empty, use *.*
If Len(sPathsPart) > 0 Then sPathsPart = Left(sPathsPart, Len(sPathsPart) - 1) Else sPathsPart = "'*.*'"
' Build the PowerShell command string
sPSCmd = PS_Cmd_Beg & sPathsPart & PS_Cmd_Mid & sHashAlgorithm & PS_Cmd_End
t.Elapsed "GetFileHashes Start GetOutput"
aOutput = Split(CStr(PS_GetOutput(sPSCmd)), vbCrLf)
t.Elapsed "GetFileHashes End GetOutput"
' Add the output for filenames / hashes to the dictionary to return
For Each itm In aOutput
itmParts = Split(itm, "|")
If UBound(itmParts) > 0 Then retVal.Add itmParts(0), itmParts(1)
Next itm
Loop
t.Elapsed "GetFileHashes End"
Set PS_GetFileHashes = retVal
Error_Handler_Exit:
On Error Resume Next
Exit Function
'Get-ChildItem -Path '.\*.xls','.\*.csv' -Recurse -File | Get-FileHash -Algorithm SHA256
Error_Handler:
MsgBox "The following error has occured" & vbCrLf & vbCrLf & _
"Error Number: " & Err.Number & vbCrLf & _
"Error Source: PS_GetFileHash" & vbCrLf & _
"Error Description: " & Err.Description & _
Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _
, vbOKOnly + vbCritical, "An Error has Occured!"
Resume Next
Resume Error_Handler_Exit
End Function
Maybe as a “fun” exercise, YOU could refactor the function for me. If so, I’ll credit you in a later message.
For now, I have a few things in there you don’t really need to track timings and you should also get a copy of the PS_GetOutput command from here: DevHut’s Daniel Pineault’s VBA PowerShell Function with Update | Access JumpStart. Or back on DevHut.
Trackbacks/Pingbacks